Dilated Semantic Segmentation for Breast Ultrasonic Lesion Detection Using Parallel Feature Fusion

Breast cancer is a growing threat, and the death rate in developing countries is rising rapidly. Early detection of breast cancer is therefore critical to lowering mortality. Several researchers have worked on breast cancer segmentation and classification using various imaging modalities. Ultrasonic imaging is one of the most cost-effective modalities and offers high sensitivity for diagnosis. The proposed study segments ultrasonic breast lesion images using a Dilated Semantic Segmentation Network (Di-CNN) combined with a morphological erosion operation. For feature extraction, we used the deep neural network DenseNet201 with transfer learning. We propose a 24-layer CNN that uses transfer-learning-based feature extraction to further validate and enrich the features with the target intensity. To classify the nodules, the feature vectors obtained from DenseNet201 and the 24-layer CNN were fused using parallel fusion. The proposed methods were evaluated using 10-fold cross-validation on various vector combinations. The accuracy of the CNN-activated feature vectors and the DenseNet201-activated feature vectors combined with a Support Vector Machine (SVM) classifier was 90.11% and 98.45%, respectively. With 98.9% accuracy, the fused feature vector with SVM outperformed the other configurations. Compared with recent algorithms, the proposed algorithm achieves a better breast cancer diagnosis rate.


Introduction
There are numerous cancer types in the world, with breast cancer becoming a leading cause of death in 2020. In 2020, 2.2 million new cases of breast cancer were reported worldwide. According to the Global Cancer Observatory, 0.68 million people died from breast cancer worldwide, with Asia the most affected region at a 50.5% share [1]. Breast cancer is often not diagnosed until it is advanced because people in middle-income countries have fewer resources. Prevention strategies, on the other hand, can reduce the risk of death. Indeed, because it has such a negative impact on women's health, this cancer must be detected in its early stages.
Breast cancer is diagnosed using a variety of imaging modalities. Deep learning and machine learning have been applied in a variety of applications, including renewable energy [2], medical imaging [3], cloud computing [4], agriculture [5,6], fishery [7], cybersecurity [8], and optimization [9]. Medical imaging has seen many advancements in recent years, resulting in non-invasive imaging modalities. Ultrasound, mammography, and magnetic resonance imaging (MRI) are all common medical imaging techniques. Ultrasonic imaging is one of the best techniques because it does not use harmful radiation that can damage the body. It is also sensitive to dense breast masses, which improves the differentiation of cysts from solid tumors, a task that is usually difficult for mammography [10].
For precise resolution, the frequency range in breast cancer biopsy is between 30 and 60 MHz. A long-standing argument for ultrasonic technology is that it does not cause tissue heating, whereas ionizing techniques, such as X-rays, can induce cancer; medical professionals therefore demand a fair amount of justification before exposing patients to ionizing radiation [11].
Furthermore, ultrasonic technology is regarded as a necessary alternative to mammography because it is less expensive, more accurate, more sensitive, less invasive, and takes less time. Sonographers frequently perform manual diagnoses using ultrasonic reports, which is time consuming and can compromise the results if they are inexperienced. Some breast lesions must be identified in these images. To detect these lesions, numerous studies have used segmentation and localization.
Several deep learning algorithms have been used to segment medical images [12]. U-Net is one of the most successful deep-learning-based image segmentation approaches for medical image analysis. It employs the downsampling to upsampling approach for skip connections. The proposed study used this approach in its framework for segmentation purposes, which was inspired by in-depth semantic segmentation. Following lesion segmentation, feature extraction is an essential part of classification because the acquired meaningful features lead to the correct diagnosis of breast cancer [13]. While numerous research papers on feature extraction and classification have been published, radiologists still require reliable, informative potential features for a more powerful cancer diagnosis due to computer experts' lack of domain knowledge [14].
Artifacts, speckle noise, and lesion shape similarities can be found in ultrasonic images. Breast lesion segmentation remains an unsolved problem as a result of these difficulties [15][16][17]. Existing studies lack robustness and do not adequately address intensity inhomogeneity, artifact removal, and precise lesion segmentation [15,16,18,19]. Deep-learning-based approaches for semantic segmentation and classification have gained popularity because the deep convolutional process extracts rich feature vectors [19,20].
As a result, the aforementioned issues must eventually be considered for automated breast cancer diagnosis in order to improve methods, efficiency, and accuracy [21]. The proposed framework for semantic segmentation of ultrasonic breast images employs dilated factors in convolutional layers.
The main contributions of this study are given below:
• A dilated semantic segmentation network with weighted pixel counts.
• The segmentation method enhanced with the erosion operation.
• Dense features extracted from the proposed 24-layer CNN, which uses transfer learning to pass the enriched features to the next layer.
• The fusion of the feature vectors obtained from the image segmented by the Di-CNN.
The rest of the article is structured as follows: Section 2 includes recent related work and the detailed literature review, along with the description of modalities and results. Section 3 presents the materials and methods, including the proposed segmentation approach with the dilated CNN framework and a flowchart of the proposed framework. The experimental results and their analysis are given in Section 4. Finally, the conclusions are made in Section 5.

Related Work
Several computer-assisted diagnosis systems were developed in previous studies. Some employ segmentation of a specific region of lesions, while others employ direct classification without segmentation. Various segmentation studies have been conducted in the past. A context level set for breast lesion segmentation is proposed. To obtain discriminative information, low-level features are used.
The semantic information is gathered using U-Net, and contextual features are then added to create a new energy term [22]. An encoded U-Net approach is used for breast tumor segmentation, achieving a Dice score of nearly 90.5% on a 510-image dataset. To demonstrate the benefit of attention encodings, the salient attention layers are compared against a variant without them.
The Hilbert transform was used to reproduce B-mode features from raw images, followed by the marker-controlled watershed transformation to segment the breast cancer lesion [23]. The techniques used, which were solely focused on texture analysis, were very susceptible to speckle noise and other objects.
Following the extraction of shape-based and texture features from the breast lesion, a hybrid feature set was created. Various machine learning classifiers were used to distinguish between cancerous and benign lesions.
The authors proposed a novel second-order subregion pooling network [24] for improving breast lesion segmentation in ultrasound images. Each encoder block of the segmentation network includes an attention-weighted subregion pooling (ASP) module, which refines features by combining feature maps from the whole image with those from subregions. Furthermore, a guided multi-dimension second-order pooling (GMP) module is trained in each subregion to learn effective second-order covariance representations by leveraging additional guidance and multiple feature dimensions.
Similarly, Ref. [25] proposed an effective method for extracting and selecting features. It was suggested that clinical efforts be reduced without segmentation; effective feature selection can improve classification. It proposed using histogram pyramids with a correlation-based features selection approach to extract oriented gradient features. Following that, by combining learned weights with it, the classification was carried out using minimal sequential optimization. The achieved classification sensitivity was 81.64%, with a specificity of 87.76% [25]. A deep learning-based study used privately collected ultrasonic images to feed shape and orientation scores to the quantitative morphological score. Finally, logistic regression-based analysis was carried out, where validation results show good performance in distinguishing ultrasonic breast masses from unnecessary biopsies [26].
Another study used large numbers of data to train deep learning models with B-mode (binary mode) and color Doppler-based dual models. Three databases were selected with the assistance of 20 radiologists for comparison between expert diagnosis and intelligent model-based analysis. The proposed algorithm outperformed all datasets with which it was tested and achieved 0.982 of Area Under the Curve (AUC) [27]. To target multiview features, an updated Inception-v3 model is proposed that employs coronal views. Following that, the proposed study compared results with HOG and PCA-based methods, which yielded promising results [28].
A second branch temporal sequence network is proposed, which employs two types of data: binary mode and contrast-enhanced ultrasonic image data. Later, this temporal sequence model employs a shuffling method to shuffle sequences rather than to enhance temporal information. The study produced better results than previous studies, according to the findings [29].
Another study employed cutting-edge CNN variants such as InceptionV3, ResNet50, VGG19, and VGG16, with Inception-v3 achieving the most significant results when compared to sonographer interpretation [30]. A combinational morphological and edge-features analysis is proposed, with a primary focus on the sum of curvatures computed from histograms of lesion shapes. A Support Vector Machine (SVM) is used to classify with single morphological features and with incorporated edge features [31]. Some previous works, with their dataset details and results, are given in Table 1.
Biomarker-based research is presented in [33], with optimal parameters of tissue elasticity and indirect pathology. Wrapper feature selection with empirical decomposition is also used for feature reduction. Finally, the estimation model accounts for the progressive transformation of carcinogenic tissues. For shape estimation using ultrasonic images, a mathematical model [35] is proposed. In this model, the tumor location is combined with an ultrasonic array to extract breast mass information from the image. Tumor size is recognized and calculated in each image, and the use of higher frequencies is recommended as future work to achieve higher visual resolution. A parallel hybrid CNN approach for classification into four categories is proposed in [36]. The target data remain the same in this method, but the training of the proposed CNN is switched from same-domain data to different-domain data. Data augmentation is used to mitigate overfitting. The patch-wise and full-image-wise classifications are reported with 90.5% and 97.4% accuracy, respectively.
The authors introduced semantic segmentation [37] with patch merging, where the region of interest (ROI) is cropped using diagonal points. After enhancing patches with various filters, superpixels and a bag-of-words model are used to obtain features. The classification is first performed using a neural network; the k-nearest neighbor (KNN) method is then applied to improve overall classification performance. The authors propose a U-Net-based segmentation in [38] for tissue type classification. The Gauss-Newton Inversion method is then used to reconstruct masses in order to target ultrasonic and dielectric properties. The proposed algorithm outperforms previously proposed tissue type classification studies.
A reconstruction method based on the natural frequency of ultrasonic images is proposed in [34]. Raw data are used to create small patches and amplitude samples. For the classification of relevant instances, radio-frequency amplitude parameters known as Nakagami parameters are used. Using contrast-enhanced perfusion patterns, the authors characterized breast cancer molecular subtypes [39]. The proposed improvement is important in the differential diagnosis of breast cancer subtypes. Statistical analysis is used to evaluate the targeted subtypes using decision variables such as means and standard deviations. A privately collected patient dataset is used for the analysis of breast ultrasonic image features, which are extracted with the assistance of sonographers.
A multivariate analysis is performed in [40] to diagnose breast cancer, reporting the area under the curve (AUC) with a 95% confidence interval as 0.75 for the primary cohort and 0.91 for the external cohort. This study achieved 88% sensitivity. Furthermore, the authors performed real-time burn ultrasonic image classification using texture GLCM features [41]. Different temperature levels are used for pair-wise classification of binary-mode slices.
Another study uses Faster-RCNN for lesion localization [19]; in the absence of available datasets, transfer-learning-based features are used. To evaluate the segmented lesions, detection points and intersection over union (IoU) are used. The RGB approach improves dataset recall more than the other approaches. An extensive texture analysis, with and without discriminant analysis of GLCM and AGLOH features, is presented in [42] for comparative breast cancer classification.
The authors carried out ensemble learning using ResNet, VGGNet, and DenseNet [43], achieving 94.62% accuracy with 92.31% sensitivity. Another study uses a similarity-checking network, a Siamese CNN. Similarly, chemotherapy-related general features from ultrasonic images are used with logistic regularization [44]. The method is evaluated using AUC, achieving 0.79 without prior information and 0.84 with it, an improvement over morphological features.
For breast cancer classification based on histology images, the authors developed a method based on deep CNNs [32]. The dataset used was the ICIAR 2018 dataset of hematoxylin-and-eosin-stained breast histology images. Their approach combined several deep neural network architectures with a gradient-boosted trees classifier. They recorded an accuracy of 87.2% for the four-class classification task. At the high-sensitivity operating point, they recorded 93.8% accuracy, a 97.3% AUC, 96.5% sensitivity, and 88.0% specificity for a two-class classification task to detect carcinomas.

Methodology
In the recent era, many studies have suggested employing deep learning, as it performs well in medical imaging diagnosis and other image recognition challenges [10,20]. A new ultrasonic imaging dataset with given label masks is used to address the segmentation and classification challenges [45]. The study used dilated factors in convolutional layers to convolve more effectively than the simple convolution operation. The proposed Dilated Convolutional Neural Network (Di-CNN) accurately detects ultrasonic breast tumors, whereas the results obtained using a simple CNN were not promising for recovering the same tumor. Therefore, we applied an erosion operation with a disk-type structuring element, with an optional area-size filtering step, to isolate the tumor. A flowchart of the proposed framework is given in Figure 1.
With a proposed 24-layer CNN, the final segmented tumors are used to extract features. To obtain the transfer learning-based features, another pre-trained ImageNet variant, called DenseNet201, is used. To classify the malignant and benign nodules, both feature extraction approaches are used with and without fusion.

Segmentation
The ROI is extracted from an input image during segmentation. Pixel-level classification, also known as semantic segmentation, is used to extract tumors from given ultrasonic images. Feature vectors are extracted from pixel intensity levels in semantic segmentation. To achieve the goal of tumor extraction, the proposed Di-CNN is further supported by morphological operations.

Dilated Semantic Convolutional Neural Network
The proposed framework employs the regular convolutional operation supported by a dilation factor. This dilation is increased in each subsequent convolutional layer to obtain more accurate spatial information about tumors and their surroundings. Figure 2 depicts the proposed dilated semantic CNN architecture. In the proposed Di-CNN, we used five main convolutional blocks, each consisting of one convolutional layer with batch normalization and a ReLU activation layer. These convolutional layers are enriched with dilation factors: the factor is doubled from block to block, up to 16. The convolutional kernel's filter size is 3 × 3 with 'same' padding in all directions, and each layer uses 64 filters. The convolutional operation is computed as usual by convolving each 3 × 3 patch, advancing row by row and column by column across the image lattice. These convolutions are aided by dilation, which inserts spacing between kernel taps to cover more spatial context. A dilation factor of 1 denotes the standard 3 × 3 kernel, applied centrally in the first convolutional block.
Progressing from factor 1 to 2 expands the 3 × 3 kernel's coverage toward the image corners while keeping it centered. This spacing is increased up to 16, supported by batch normalization and ReLU activations. The proposed CNN is trained for 500 epochs, which takes considerable time. The dilated convolution operator was proposed by [46] and is defined in Equation (1):

(f ∗_l k)(p) = Σ_{s + l t = p} f(s) k(t),  (1)

where l is the dilation factor, f is a discrete function (the input feature map), and k is the filter of a given size; the dilated operator ∗_l is applied to k. The results of the proposed Di-CNN are shown in Figure 3. The input image has two areas with two colors, as shown in Figure 3; the predicted tumor-class pixels are represented by the dark blue color. However, some images in the testing data contain noise in the predicted results, which was later removed by the erosion operation.
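To illustrate how Equation (1) spaces the kernel taps, the sampling pattern can be sketched with a naive NumPy implementation; the function name and the toy input are ours for illustration, not part of the proposed network:

```python
import numpy as np

def dilated_conv2d(image, kernel, dilation=1):
    """Naive 'valid' 2-D dilated convolution.

    With dilation l, a k x k kernel covers a span of (k-1)*l + 1
    pixels, enlarging the receptive field without adding weights.
    """
    k = kernel.shape[0]
    span = (k - 1) * dilation + 1          # effective receptive field
    h, w = image.shape
    out = np.zeros((h - span + 1, w - span + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sample the input with gaps of `dilation` pixels
            patch = image[i:i + span:dilation, j:j + span:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

# A 3x3 kernel with dilation 2 behaves like a sparse 5x5 kernel.
out = dilated_conv2d(np.ones((7, 7)), np.ones((3, 3)), dilation=2)
print(out.shape)   # (3, 3): valid positions for a 5-pixel span
```

Doubling the dilation per block (1, 2, 4, 8, 16) grows the receptive field exponentially while each layer still learns only nine weights per filter.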

Erosion
In image processing, erosion is a fundamental operation used to target a specific morphology in the image structure. In our case, the tumor resembles a disk shape in most images, which led to promising results in recovering the tumor. Disk-type erosion uses a radius value to centrally map a disk-like element onto a given binary image, removing noise that does not match the disk shape. The erosion operation is defined in Equation (2) [47], and the structure of the disk is depicted in Figure 4.
In Equation (2), I is the input image containing the lesion and E_str is the disk-shaped structuring element. The erosion of I by E_str is defined as

I ⊖ E_str = { z | (E_str)_z ⊆ I },  (2)

where z ranges over the set of positions at which the translated structuring element fits entirely inside the foreground. The disk-type element uses a radius to specify the size of the disk mapped onto the image. The background image is a binary image in which white pixels (value one) form a disk-like structure. To remove noise, this structure is mapped onto the predicted tumor pixels. Figure 5 depicts the results of the erosion operation. The erosion begins at the top-left of the given binary image and maps the structuring element rightward, advancing row by row. A region must match the element's shape, allowing for some intensity holes, to survive; otherwise its white pixels are eliminated. Image noise is removed in this manner, whether the disk-shaped tumors are predicted masks or actual ground-truth labels. Because erosion can also remove significant areas, the proposed framework may optionally apply the area-size operation to some of the images. The final segmented tumor is outlined in red on the original images to show actual and predicted lesions. Figure 6 shows some of the image results from input to final tumor. Figure 6 has five columns: the first shows the original images, and the second contains the ground-truth label images of the given dataset. The third column shows the results predicted by the Di-CNN, in which some noise was also predicted, as shown in Figure 3. To remove the noise, an erosion operation with a radius in the range 5-20 is performed to cover the various tumor types of both classes. The morphology of the exact tumor may be disturbed, but this is preferable to image noise. The fourth column depicts the final eroded mask, which was later mapped onto the original image to obtain the tumor portion.
The fifth column shows the final outline of the eroded lesions on the original images.
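The disk-type erosion can be sketched as follows; `disk` and `binary_erode` are illustrative helpers rather than the authors' implementation (MATLAB's `imerode` or scikit-image provide equivalents):

```python
import numpy as np

def disk(radius):
    """Disk-shaped structuring element: True where x^2 + y^2 <= r^2."""
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x ** 2 + y ** 2 <= radius ** 2

def binary_erode(mask, selem):
    """Binary erosion: keep a pixel only if the structuring element,
    centered on it, fits entirely inside the foreground."""
    r = selem.shape[0] // 2
    padded = np.pad(mask.astype(bool), r, constant_values=False)
    ys, xs = np.nonzero(selem)
    out = np.zeros(mask.shape, dtype=bool)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            out[i, j] = padded[i + ys, j + xs].all()
    return out

# A 10x10 lesion blob plus one isolated speckle-noise pixel.
mask = np.zeros((20, 20), dtype=bool)
mask[5:15, 5:15] = True
mask[1, 1] = True                      # noise
eroded = binary_erode(mask, disk(2))
print(eroded[1, 1], int(eroded.sum()))  # noise removed, blob shrunk
```

The isolated pixel vanishes because the disk never fits inside it, while the large blob survives in shrunken form, mirroring the paper's noise-removal step.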
The final eroded masks were used to process the feature extraction as segmented lesions. Each aspect of the features, such as shape, localization, and intensity type, was covered by the in-depth learning features.

Features Extraction
The features extraction phase is an important step in obtaining critical information about the malignant and benign classes. Both classes, however, have distinguishing features. Many previous works on feature extraction have been developed, but they may not cover every aspect of geometry, intensity, and localization. Deep learning is used in the proposed study for this purpose. Two types of approaches are used to extract these features.

DenseNet201 Transfer Learning Features
DenseNet201 is a 201-layer network pre-trained to recognize 1000 different object categories. Its learned representations can be transferred from one dataset to another of the same or smaller size and used to classify objects. Moreover, the deep, densely connected layers of this 201-layer model can produce activations for other image datasets.
Similarly, the proposed study makes use of transfer learning on the DenseNet201 network. The fully connected layer fc1000 is used to extract features from the ultrasonic image dataset. Because there were 647 images across the malignant and benign classes, the final feature matrix is 647 × 1000, with one row of features per image used for classification.

Convolutional Neural Network
To exploit deeper learning of the lesion images, the proposed study offers a 24-layer architecture trained on the training images of both classes and evaluated on the testing data. Figure 7 depicts the architecture of the proposed framework, including the weights of the layers. The proposed CNN has five main blocks, each comprising convolution, batch normalization, and ReLU layers, with max pooling applied to each block's output. The convolutional kernel is 3 × 3, and the number of filters increases from block 1 to block 5. The block-1 convolutional layer has 16 filters with the same padding on all four sides (left, right, top, and bottom).
The number of filters doubles with each block: if a block uses N filters, the next uses 2N, so blocks 1-5 use 16, 32, 64, 128, and 256 filters, respectively. Five max-pooling operations are added, one at the end of each block, each with a 2 × 2 kernel and stride 2.
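The block structure can be sketched in PyTorch as follows; this is a hypothetical reconstruction, since the input channel count, input size, and the omitted classifier head are assumptions not specified above:

```python
import torch
from torch import nn

def conv_block(in_ch, out_ch):
    """Conv (3x3, 'same' padding) -> BatchNorm -> ReLU -> MaxPool."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=2, stride=2),
    )

# Filters double per block: 16, 32, 64, 128, 256.
widths = [16, 32, 64, 128, 256]
blocks, in_ch = [], 3          # 3 input channels assumed
for w in widths:
    blocks.append(conv_block(in_ch, w))
    in_ch = w
backbone = nn.Sequential(*blocks)

x = torch.zeros(1, 3, 224, 224)
y = backbone(x)
print(y.shape)   # five 2x poolings: 224 -> 7 spatially, 256 channels
```

A fully connected 'fc' layer (whose two-unit activations form the f2 features described later) would follow this backbone.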

Feature Fusion
Both extracted feature vectors are then fused into a single feature vector using parallel concatenation. The 647 × 1000 feature vector f1 of dense features and the 647 × 2 feature vector f2 from the activated 'fc' layer of the proposed CNN are concatenated in parallel, yielding the final 647 × 1002 vector f3.
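Parallel fusion here is column-wise concatenation; a minimal sketch, with random matrices standing in for the real activations:

```python
import numpy as np

rng = np.random.default_rng(0)
f1 = rng.normal(size=(647, 1000))   # stand-in: DenseNet201 fc1000 activations
f2 = rng.normal(size=(647, 2))      # stand-in: proposed CNN 'fc' activations

# Parallel (column-wise) concatenation: one fused row per image.
f3 = np.concatenate([f1, f2], axis=1)
print(f3.shape)   # (647, 1002)
```

Each image keeps a single row, so the fused matrix can be fed directly to the classifiers described next.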

Classification
The proposed research feeds the fused feature vector into multiple machine-learning classifiers to classify the healthy and unhealthy images. The f1 vector is classified using 10-fold cross-validation as well as 70%-30% and 50%-50% splits. The f2 vector is then fed to the same classifiers, and lastly both vectors are fused into the f3 vector. All approaches are validated using the three validation schemes.
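A sketch of this protocol with scikit-learn, using a synthetic feature matrix in place of the real fused features; "Cubic SVM" corresponds to a degree-3 polynomial kernel:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the fused 647 x 1002 feature matrix.
X, y = make_classification(n_samples=647, n_features=50,
                           n_informative=10, random_state=0)

# Cubic SVM: a degree-3 polynomial kernel, one of the variants evaluated.
clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3))
scores = cross_val_score(clf, X, y, cv=10)   # 10-fold cross-validation
print(round(scores.mean(), 3))
```

Swapping the kernel (`"linear"`, `"rbf"` with different gamma values) reproduces the Linear, Quadratic, and Gaussian SVM variants, and replacing `cv=10` with a held-out split gives the 70%-30% and 50%-50% evaluations.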

Results and Discussion
The proposed study used a dataset divided into three categories: benign, normal, and malignant. Because the normal images had no mask labels, only the malignant and benign classes were used. In total, 647 images from these two classes were used for tumor segmentation, and the 133 normal instances were excluded. The images and masks varied in size, so all were resized to a uniform 512 × 512 during data preparation. The employed dataset is publicly available at [45]. The dataset description is given in Table 2.

Di-CNN Evaluation
A 19-layer dilation-enriched CNN is proposed for semantic segmentation. Three different networks were trained, and the best one was chosen because it outperformed the other two. For segmentation, the evaluation measures Intersection over Union (IoU), global accuracy, F1-score, and mean accuracy were used. Mean IoU is the most informative measure for a segmentation network; it is calculated per class as the intersection of the predicted and ground-truth masks divided by their union over the pixel counts, averaged over classes. The Mean IoU and Global Accuracy produced good results, as shown in Table 3. The accuracy obtained using the proposed Di-CNN was 80.20% for the two classes (background and tumor). The IoU over all images was 52.89%; a weighted IoU, assigning weights to both classes, allowed us to obtain more representative results. The training parameters are further explained in Table 4. The calculations for the IoU, accuracy, and BF-score measures are given in Equations (3)-(5) [48].
Intersection over Union (IoU) = |predicted mask ∩ actual mask| / |predicted mask ∪ actual mask|  (3)

All training parameters of the proposed Di-CNN (see Table 4) show a high number of epochs and a large batch size, confirming the utility of the Di-CNN. For hyperparameter optimization, we used gradient descent with momentum as the optimizer. We also used a variant of the proposed CNN to extract features from its fully connected layer.
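Equation (3) can be computed directly from two binary masks; a minimal sketch (the convention for two empty masks is our choice):

```python
import numpy as np

def iou(pred, truth):
    """Intersection over Union of two binary masks (Equation (3))."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0          # both masks empty: define IoU as perfect
    return np.logical_and(pred, truth).sum() / union

pred = np.zeros((4, 4), dtype=bool); pred[:2, :] = True    # 8 pixels
truth = np.zeros((4, 4), dtype=bool); truth[1:3, :] = True # 8 pixels
print(round(iou(pred, truth), 4))   # overlap 4 px, union 12 px -> 0.3333
```

A weighted IoU, as used in Table 3, would additionally scale each class's score by its pixel frequency before averaging.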

DenseNet201 Activations Based Classification
Activations on a fully connected layer yielded a feature vector from the augmented dataset fed through the DenseNet201 image input layer. Table 5 displays the results of various Support Vector Machine variant classifiers. It can be seen that the results using the f1 features are significantly better at identifying true positives in the given dataset. This performance, however, is based on a single train-test split. We therefore used K-fold cross-validation to assess the validity of the proposed framework. The classifiers are validated using 10-fold cross-validation, as shown in Table 6. The results of 10-fold cross-validation match those obtained with less testing data (refer to Table 5). To examine the other features, the single-CNN-activated features were then evaluated using the f2 feature vector.

CNN Activations based Classification
The single f2 feature vector was evaluated with the same SVM variants using 10-fold cross-validation and 70%-30% splits to cross-check the proposed approach's performance. The results for the 70%-30% split ratio are shown in Table 7. The proposed CNN's activated-layer features showed less accurate results than those in Tables 5 and 6. Cubic SVM shows promising results across approaches, with better accuracy, sensitivity, precision, and F1-score. The 10-fold validation using the f2 vector is shown in Table 8, which likewise shows the lower performance of f2-based classification across all evaluation measures. The study then fused the f1 and f2 feature vectors in parallel to form a new feature vector. The reason for including both is that the proposed CNN, trained for 500 epochs, captures the most valuable dataset-specific local features.

Parallel Fusion
The parallel concatenation of the f1 and f2 feature vectors resulted in a new f3 feature vector of size 647 × 1002. The f3 vector was then fed to all previously used SVM variants. The 70%-30% split-based and 10-fold cross-validation-based classifications are shown in Tables 9 and 10. They show that the proposed feature vectors are robust enough to identify benign and malignant tumors. The AUC remains 1.00 for all cases in 10-fold cross-validation, as shown in Figure 8. The evaluation measures used for the classification were calculated using Equations (5)-(9). All of the figures show excellent results for the correct diagnosis of benign cases, whereas malignant cases may require more information for classification using SVM variants. Of the 647 total cases, 636 along the diagonal were correctly diagnosed and 11 cases of the malignant class were misclassified. Compared with previous studies, the proposed study achieves dominant results.
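The evaluation measures of Equations (5)-(9) follow directly from confusion-matrix counts; a minimal sketch with illustrative counts (not the paper's actual confusion matrix):

```python
def metrics(tp, tn, fp, fn):
    """Standard measures computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)            # recall
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, precision, f1

# Illustrative counts only.
acc, sens, prec, f1 = metrics(tp=90, tn=90, fp=10, fn=10)
print(acc, sens, prec, f1)
```

With the positive class fixed (e.g., benign), the same four counts reproduce every per-classifier row reported in Tables 5-10.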

Discussion
The results presented in the preceding sections strongly support the use of the proposed methodology for breast cancer detection. The proposed work inspires considerable confidence due to its validation with 10-fold cross-validation. In addition, each feature vector is evaluated separately on different SVM variants. In classification using the first feature vector, the maximum accuracy (98.97%) was achieved by SVM-Medium-Gaussian with a 70%-30% split ratio; when the same vector was tested with 10-fold validation, the overall accuracy dropped by 0.25-0.5 percentage points. To gain further confidence using the dataset's own features, a 24-layer CNN architecture was proposed with the same parameters mentioned in Table 4. As a standalone classifier, this CNN reached only about 80% accuracy on validation data; therefore, the activations of its 'fc' layer over all data were used instead, which yielded higher results than the plain CNN classifier. This feature vector was tested on the same SVM variants with 70%-30% splits and 10-fold cross-validation. A comparison of the Di-CNN with recent state-of-the-art algorithms on the same dataset is given in Figure 13. We can observe (see Table 8) that feature vectors extracted from CNN-activated layers exhibit intermediate recall rates on 10-fold cross-validation for almost all classifiers except SVM-Medium-Gaussian, with 97.25% recall. This indicates that SVM-Medium-Gaussian returns more relevant instances during validation than the other classifiers. However, cubic SVM was the worst performer in this part of the experiment, with 81.61% accuracy.
The maximum accuracy was 89.18%, with a 96.95% sensitivity value. However, the same feature vector was used for 10-fold cross-validation, which achieves 90.11% accuracy with a 97.25% sensitivity value. Moreover, to obtain both types of feature effects in order to become more confident in our results, the proposed study concatenates both feature vectors, which later pass on to the same classifiers. The same 70-30 splits were applied and achieved 98.97% accuracy with a 100% sensitivity value. The 10-fold validation achieved better results with a maximum accuracy of 98.76% with the same 100% sensitivity.
Similarly, we can see (see Table 6) that the feature vectors extracted from the DenseNet201 network using transfer learning display low precision (94.76%) with Linear SVM compared to the other classifiers. QSVM and cubic SVM outperformed the other classifiers, with a precision of 95.71% each. All the experimental results presented in Table 6 indicate that most of the classifiers detected relevant instances as the positive class rather than as the negative target class.
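Precision and recall values like those in Tables 6 and 8 follow directly from the predicted labels. A small illustrative computation (the labels below are hypothetical, with 1 standing for the malignant/positive class):

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical test labels: 1 = malignant (positive), 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]

# Precision = TP / (TP + FP): fraction of predicted positives that are correct.
# Recall    = TP / (TP + FN): fraction of true positives that are retrieved.
prec = precision_score(y_true, y_pred)
rec = recall_score(y_true, y_pred)
print(prec, rec)  # 0.75 0.75
```

A classifier with high recall but lower precision is one that labels many instances positive, which matches the observation that most classifiers favored the positive class.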
Our results are compared with some recent state-of-the-art studies on breast cancer identification in Table 11, demonstrating that the proposed research outperforms the existing algorithms. One study [14] worked on fuzzy interpolative reasoning with features selected by a feature-ranking technique, but it achieved only 91.65% accuracy with lower sensitivity. Similarly, a deep learning-based ultrasonic image classification proposed by [22] used submodules with parameter selection to achieve 96.41% accuracy. Another study [31] used SVM, KNN, Discriminant Analysis, and random forest classifiers and achieved 82.69%, 63.49%, 78.85%, and 65.83% accuracy, respectively. The last study [43] in Table 11 used ensemble learning with CNNs such as DenseNet-X and VGG to identify breast cancer, achieving 90.77% accuracy on ultrasonic breast images.
The feature vector fused from the activated CNN and DenseNet201 (see Table 10), combined with SVM-Medium-Gaussian, shows the highest recall rate of 100% on 10-fold cross-validation. All classifiers obtained the same recall rate except the cubic SVM, with 99.55%.

Table 11. Comparison with state-of-the-art methods on ultrasonic breast images.

Method                               Accuracy (%)   Sensitivity (%)   Specificity (%)
Ensemble (DenseNet-X + VGG) [43]     90.77          96.67             89.00
DenseNet-121 [43]                    88.46          83.78             90.32
NASNet [49]                          94             --                --
ResNet [49]                          93             --                --
Inception [49]                       85             --                --
VGG16 [49]                           88             --                --
CNN-AlexNet [49]                     78             --                --
SK-U-Net with fine tuning [50]       95.6           --                --
SK-U-Net without fine tuning [50]    94.4           --                --
Inception V3 [51]                    75             --                --

As shown in Figure 9, the confusion matrix contains 12 wrongly predicted cases in total using Linear SVM, whereas the Cubic and Quadratic SVMs have a total of nine malignant-class instances predicted incorrectly. Figure 12 shows 11 malignant-class results wrongly predicted as benign using the QSVM classifier. Figure 10 shows nine wrongly predicted instances when the feature vectors were fed to SVM-Medium-Gaussian. The difference between the misclassification results in Figures 9 and 11 is that, while both have the same number of misclassifications, in Figure 9 the wrongly predicted cases belong to both classes, whereas in Figure 11 all the errors fall in a single class; this suggests that the proposed methods increase the number of correctly predicted positives. From Figure 13, RUSBoosted trees appear as the second-best algorithm for ultrasonic breast image classification after Di-CNN, with 96.60% accuracy.
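The error pattern described for the confusion matrices, the same total number of misclassifications but confined to a single class, is easy to reproduce with scikit-learn. The labels below are hypothetical and chosen only to mimic that pattern, not taken from the actual experiments:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical benign (0) / malignant (1) test split with errors in one class only.
y_true = np.array([0] * 50 + [1] * 50)
y_pred = y_true.copy()
y_pred[50:59] = 0  # nine malignant cases predicted as benign

# Rows = true class, columns = predicted class.
cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[50  0]
#  [ 9 41]]
```

With errors concentrated in one off-diagonal cell, specificity stays at 100% while sensitivity absorbs the loss, which is the asymmetry the Figure 9 vs. Figure 11 comparison highlights.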
A limitation of the proposed framework is that it does not employ a feature selection technique. Both the activated CNN feature vectors and the transfer-learning-based feature vectors could be fed into a feature selection approach to eliminate irrelevant features, yielding a vector containing only the most relevant features, which could further improve performance. We intend to cover feature selection techniques in future work.
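One plausible shape for that future feature-selection step is a filter method applied to the fused vector before classification. The sketch below uses mutual-information ranking via scikit-learn on random placeholder data; the matrix size and the choice of k are illustrative assumptions, not part of the proposed pipeline:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Hypothetical fused feature matrix (placeholder for DenseNet201 + CNN features).
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 512))
y = rng.integers(0, 2, size=100)

# Keep the k features most informative about the benign/malignant label.
selector = SelectKBest(mutual_info_classif, k=64)
X_sel = selector.fit_transform(X, y)
print(X_sel.shape)  # (100, 64)
```

A smaller, more relevant vector would also reduce SVM training time, which matters when every variant is re-fit ten times under cross-validation.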

Conclusions
Breast cancer is becoming increasingly lethal, and in developing countries the death rate is rising rapidly. Early detection of breast cancer is therefore critical to lowering the mortality rate. In this study, we used pixel-level semantic segmentation of ultrasonic breast lesions with dilated factors. A mask-based ultrasonic imaging dataset was used. Following the segmentation phase, the extracted lesions were subjected to an erosion operation and a size filter to remove noise from the segmented lesions relative to the ground-truth masks. Finally, the DenseNet201 deep network was used for transfer-learning features, and the proposed CNN was used for feature activations. Both the individual and the fusion-based feature vectors were validated using the SVM classifier variants under two validation techniques. The final comparison showed a clear improvement in correctly identifying true positives. The accuracy of the CNN-activated feature vectors and the DenseNet201-activated feature vectors combined with the SVM classifier was 90.11% and 98.45%, respectively. With 98.9% accuracy, the fused version of the feature vector with SVM outperformed the other algorithms.
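The erosion-plus-size-filter post-processing summarized above can be sketched with SciPy's morphology tools. This is a toy illustration on a synthetic binary mask, not the actual Di-CNN output; the structuring element and area threshold are assumed values:

```python
import numpy as np
from scipy import ndimage

# Toy binary segmentation mask: one lesion blob plus an isolated noise pixel.
mask = np.zeros((32, 32), dtype=bool)
mask[8:20, 8:20] = True  # 12x12 lesion region
mask[2, 2] = True        # speckle noise

# Morphological erosion removes thin structures and isolated pixels.
eroded = ndimage.binary_erosion(mask, structure=np.ones((3, 3)))

# Size filter: keep only connected components above a minimum area.
labels, n = ndimage.label(eroded)
sizes = ndimage.sum(eroded, labels, range(1, n + 1))
keep = np.isin(labels, 1 + np.flatnonzero(sizes >= 20))
print(keep.sum())  # area of the surviving lesion region
```

Eroding the 12x12 blob with a 3x3 structuring element leaves a 10x10 core while the lone noise pixel disappears, mirroring the cleanup applied before comparing against the ground-truth masks.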
In the future, we intend to use more data for similar breast cancer identification tasks. The proposed CNN framework for semantic segmentation and classification may also be improved with hyperparameter optimization.