Correlated-Weighted Statistically Modeled Contourlet and Curvelet Coefficient Image-Based Breast Tumor Classification Using Deep Learning

Deep learning-based automatic classification of breast tumors using parametric images derived from ultrasound (US) B-mode images remains an active research area. The Rician inverse Gaussian (RiIG) distribution has recently emerged as an appropriate statistical model for such images. This study presents a new approach to breast tumor classification from B-mode ultrasound images based on correlated-weighted contourlet-transformed RiIG (CWCtr-RiIG) and curvelet-transformed RiIG (CWCrv-RiIG) images fed to a deep convolutional neural network (CNN) architecture. A comparative study with other statistical models, namely the Nakagami and normal inverse Gaussian (NIG) distributions, is also presented. The term "correlated-weighted" refers to weighting the contourlet and curvelet sub-band coefficient images by their correlation with the corresponding RiIG statistically modeled images. On three freely accessible datasets (Mendeley, UDIAT, and BUSI), it is demonstrated that the proposed approach can achieve accuracy, sensitivity, specificity, NPV, and PPV values above 98 percent using the CWCtr-RiIG images. On the same datasets, the proposed method offers superior classification performance to several other existing strategies.


Introduction
For both industrialized and developing nations, female breast cancer is a very pressing issue. According to a recent report from the American Cancer Society's Cancer Statistics Center, 1,918,030 new cancer cases and 609,360 cancer deaths are projected in the United States for 2022, with breast cancer accounting for 290,560 of the new cases [1].
Among imaging modalities such as mammography and MRI, breast ultrasound (US) imaging is one of the most promising techniques for classifying breast tumors. Numerous studies have been conducted, and continue to be conducted, to increase the precision of automatically distinguishing benign from malignant breast tumors. In 2002, K. Horsch et al. [2] combined three computer-extracted characteristics, namely the depth-to-width ratio, the normalized radial gradient, and an autocorrelation feature, to detect breast cancers in the lesion region. A computer-aided diagnostic (CAD) system was introduced in 2007 by Wei-Chih Shen et al. [3], in which classification outcomes are estimated from the mean values and standard deviations (SDs) of geometrical features such as shape, orientation, margin, lesion boundary, echo pattern, and posterior acoustic properties. The reported accuracy is 91.7%, but in that study the lesion site was segmented from the healthy breast tissue both manually and automatically. In big-data analysis of US images, manual lesion boundary delineation can be challenging.
Statistical modeling is another imaging technique in which various features are extracted from statistical models, such as Gaussian or Nakagami images obtained from the original B-mode images; this is called parametric imaging [21,22] and has yielded satisfactory results in breast tumor classification. These statistical techniques were developed primarily to quantitatively model the sound waves scattering through tissue, which can offer a deeper understanding of the underlying structure and more accurate features. Compared with spatial-domain visual ultrasound images, false positive (FP) and false negative (FN) results can be characterized statistically. Nakagami modeling was employed by Ming-Chih Ho et al. [23] to investigate the detection of rat liver fibrosis; although this is not the same task as classifying breast tumors, it supports the value of parametric imaging. The use of deep CNNs as a technique for the automatic interpretation of various medical image types is a recent development in this field, enabling the quick and accurate identification of various medical conditions. Deep neural networks enable the development of automated medical solutions that are highly efficient and accurate, particularly for the automated categorization of breast tumors [24]. This is in contrast to conventional feature-engineering-based methods, whose robustness depends on the accuracy of the feature extraction techniques. Shear-wave elastography data were subjected to CNN and morphological information extraction methods by Zhou et al. [25] for the categorization of breast tumors. A CNN was used for breast tumor categorization by Zeimarani et al. [26]; however, they applied it directly to breast ultrasound images. A generative adversarial network (GAN) and a CNN were successfully used by Singh et al. [27] to segment and categorize breast tumors from ultrasound images. Ramachandran et al. [28] achieved a decent outcome on a small online dataset using a straightforward neural network that is inexpensive and simple to use. Hou et al. [29] showed that a CNN classifier may be trained on the device itself, without using a cloud-based server, starting from a pre-trained neural network model. The research of Shin et al. [30] showed that a neural network combining Faster R-CNN and ResNet-101 is feasible. A technique for converting US images to RGB and fine-tuning through back-propagation was published by Byra et al. [31]. Qi et al. [32] demonstrated a deep CNN technique with multiple-scale kernels and skip connections. Deep neural network approaches, however, do not consider statistical aspects or traits.
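As a concrete illustration of parametric imaging, the following minimal MATLAB sketch (not from the source) builds a Nakagami m-parameter map using the standard moment-based (inverse normalized variance) estimator over a sliding window; the file name and window size are illustrative assumptions.

```matlab
% Minimal sketch: Nakagami m-parameter map from a grayscale B-mode image.
% Assumes the Image Processing Toolbox (nlfilter); slow but simple.
env = im2double(imread('bmode.png'));     % hypothetical B-mode/envelope image
win = [13 13];                            % sliding window (square, odd-sized)

% Moment-based (inverse normalized variance) estimator of the Nakagami m
% parameter: m = (E[X^2])^2 / Var(X^2), computed over each local window.
mfun = @(blk) (mean(blk(:).^2))^2 / max(var(blk(:).^2), eps);
mMap = nlfilter(env, win, mfun);          % Nakagami parametric (m) image

imagesc(mMap); axis image; colorbar; title('Nakagami m-parameter map');
```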
In this study, it is demonstrated that the Rician inverse Gaussian (RiIG) distribution is an extremely successful model that captures the statistics of the contourlet and curvelet coefficient images [33]. It has been demonstrated that features derived by the CNN network from the RiIG statistically modeled (i.e., parametric) images, compared with features extracted from US B-mode images, provide a higher level of accuracy for breast tumor classification. First, the contourlet and curvelet transforms are applied to the US B-mode images to obtain contourlet and curvelet coefficient (C) images. The next step is to create contourlet or curvelet parametric (CP) images by substituting each pixel of the coefficient (C) image with the estimated RiIG parameter (δ), computed over a local neighborhood centered on the corresponding pixel. The pixel values are, therefore, transformed (δ-mapped) into the parameter values that produce the CP image. To enhance the precision of the statistical characteristics in classification, correlated-weighted contourlet (Ctr)- or curvelet (Crv)-transformed parametric (CWCtrP or CWCrvP) images are introduced. The contourlet or curvelet parametric (CP) images are correlated with their matching contourlet- or curvelet-transformed coefficient (C) images to create the CWCtrP or CWCrvP images. Because weights are assigned to each parameter of the CP images by applying correlation with the relevant contourlet or curvelet coefficient (C) images, the term "correlated-weighted" is used in this scheme. In this work, the CWCtrP and CWCrvP images are used to classify breast tumors in a deep CNN architecture. The proposed methods feed the features extracted from the deep CNN's global average pooling layer to fully connected layers and to a variety of machine learning classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), and random forest. Among the features recovered from the database US B-mode images, parametric (P) images, contourlet-transformed coefficient (C) images, contourlet parametric (CP) images, and weighted contourlet parametric (WCP) images, the WCP features have previously been demonstrated to yield the highest accuracy [34]. So, only the correlated-weighted versions of the contourlet- and curvelet-transformed parametric (CWCtrP and CWCrvP) images are examined in the proposed method, rather than the P, C, and CP images. In this context, the CWCtrP and CWCrvP images, consisting of six concatenated contourlet and curvelet sub-band coefficients, are fed to the deep CNN network separately for a comparative study of the two multi-resolution transform domains' performance. Pre-trained networks cannot be used with our six-channel stacks of contourlet and curvelet transform-domain CWCtrP and CWCrvP coefficient images, since they are designed for one- or three-channel visual images with spatial dimensions. As a result, a custom deep CNN architecture is provided. The performance of the aforementioned classifiers is evaluated on three publicly accessible US image databases for identifying breast tumors and compared with state-of-the-art techniques.
The following list summarizes this work's significant contributions:
• Correlated-weighted contourlet- and curvelet-transformed RiIG (CWCtr-RiIG and CWCrv-RiIG) parametric images are introduced for breast tumor classification.
• A custom deep CNN architecture is designed to accept six-channel stacks of contourlet or curvelet sub-band images, for which pre-trained networks are unsuitable.
• The proposed schemes are evaluated on three publicly available US image databases and compared with state-of-the-art techniques.

As is typical of clinical scanner outputs, the images in these databases had previously undergone pre-processing (such as edge enhancement, speckle reduction, compressed dynamic range, persistence, etc.). As a result, additional pre-processing steps for removing various noises, artifacts, and anomalies are not required. Data augmentation was carried out primarily to expand the sample count needed to train the neural network and to eliminate class imbalance, equalizing the number of benign and malignant cases at 1000 benign and 1000 malignant instances in each of the three databases. This yielded 2000 images per database and 6000 augmented images in total. The following sub-sections outline the essential processes for preparing the database images for the dual-input CNN architecture.

Contourlet Transform

Since the standard discrete wavelet transform (DWT) domain includes only horizontal, vertical, and diagonal directions, it offers limited directional information. The contourlet transform, on the other hand, supports a wide range of arbitrary forms and contours that are not restricted to three directions. The normalized B-mode images are transformed using the contourlet transform, which uses a filter bank to separate the directional and multiscale decompositions [7], as illustrated in Figure 1.

Curvelet Transform
The best sparse representation of objects with edges and contours is provided by the curvelet transform. In contrast to the isotropic elements of wavelets, the needle-shaped elements of the curvelet transform have extremely high directional sensitivity and anisotropy. Later, the second-generation curvelet transform was demonstrated to be a very effective tool for a variety of applications, including partial differential equations (PDEs), seismic data exploration, image processing, and fluid dynamics. Periodization was used to treat image borders in earlier versions of the transform. In the current version, the data are properly arranged, and the discrete cosine domain is tiled instead of the discrete Fourier domain, which is a significant change. Contourlets, a discrete filter-bank structure, can handle smooth images with piecewise smooth contours. Structures resembling curvelets in the continuous domain can be coupled to this discrete transform. As a result, the contourlet transform can be considered a discrete version of a certain curvelet transform. Figure 2 illustrates how curvelet constructions connect to a polar coordinate-based partition of the 2-D frequency plane [38] and call for a rotational operation.

Figure 3 compares the effectiveness of the contourlet transform and curvelet transform in terms of improved descriptors of contour segments. With increasing decomposition levels for both the contourlet and curvelet transforms, it can be noticed that the contour detection grows smoother as the number of directions increases up to 32. The literature claims that the contourlet transform can also offer a more accurate description, arbitrary form definitions, contours, and additional directional information [7,38]. The pyramidal decomposition levels rise along with an increase in the directional sub-bands, and numerous variable orientations are seen in the directional decomposition levels. A crucial component of contourlets, the directional filter bank, has a practical tree structure where aliasing is permitted to occur and is removed by correctly designed filters. Because of this, the primary distinction between contourlets and curvelets is that the former is explicitly specified on discrete rectangular grids, which are easier to digitize. Unfortunately, contourlet functions exhibit more oscillations along the needle-like elements than curvelets and exhibit less well-defined directional geometry/features. This results in artifacts in denoising and compression.
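For readers who want to reproduce the decompositions, the following MATLAB sketch shows one way to obtain the sub-band coefficient images. It assumes two widely used third-party implementations, Minh Do's Contourlet Toolbox (pdfbdec) and CurveLab (fdct_wrapping); the source does not name its own implementation, and the filter choices, level counts, and sub-band indices below are illustrative.

```matlab
% Hedged sketch: contourlet and curvelet decomposition of a normalized
% B-mode image. Assumes the third-party Contourlet Toolbox (pdfbdec) and
% CurveLab (fdct_wrapping) are on the MATLAB path.
img = im2double(imread('bmode.png'));          % hypothetical input image

% Contourlet: Laplacian pyramid + directional filter bank.
% nlevs = [3 4 5] requests 2^3=8, 2^4=16, and 2^5=32 directional sub-bands
% at pyramidal levels 2, 3, and 4, matching the sub-band counts in the text.
ctrCoeffs = pdfbdec(img, '9-7', 'pkva', [3 4 5]);   % cell array of sub-bands

% Curvelet: wrapping-based fast discrete curvelet transform.
% Returns a cell array crvCoeffs{scale}{angle} of coefficient matrices;
% per the text, scale 4 contains 32 angle sub-bands.
crvCoeffs = fdct_wrapping(img, 1);                  % real-valued transform

% Example: pick one directional sub-band from each domain (indices illustrative).
P4D32 = ctrCoeffs{4}{32};      % finest pyramidal level, 32nd directional sub-band
S4A32 = crvCoeffs{4}{32};      % scale 4, angle 32
```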

Contourlet and Curvelet Parametric (CP) Image
Rician inverse Gaussian (RiIG) image: Eltoft et al. [33] introduced the RiIG distribution, which is a mixture of the Rician and inverse Gaussian distributions and is expressed as

p(x) = √(2/π) α^(3/2) δ e^(δγ) x (δ² + x²)^(−3/4) K₃₎₂(α√(δ² + x²)) I₀(βx),

where K₃₎₂(·) is the modified Bessel function of the second kind of order 3/2 and I₀(·) is the modified Bessel function of the first kind of order zero. Here, α and β affect the distribution's steepness and skewness, respectively; β < 0 indicates a distribution that is skewed to the left, and β > 0 indicates one that is skewed to the right, whereas δ is the dispersion parameter. The value of γ can be calculated as γ = √(α² − β²). Figure 4 displays a selection of RiIG pdf realizations for different parameter values. It is seen that, with an increase in α and β, the distribution becomes steeper and skews to the right, respectively. On the other hand, as β decreases, it skews to the left. In addition, as δ is increased, the distribution becomes more dispersed.
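The pdf can be evaluated directly with MATLAB's Bessel functions. The sketch below (not from the source) plots the RiIG pdf for one illustrative parameter setting, in the spirit of Figure 4; the parameter values are assumptions.

```matlab
% Hedged sketch: evaluate the RiIG pdf quoted above for illustrative
% parameters; useful for reproducing Figure 4-style curves.
x     = linspace(0.01, 10, 500);
alpha = 4; beta = 1; delta = 2;          % illustrative parameter values
gam   = sqrt(alpha^2 - beta^2);          % gamma = sqrt(alpha^2 - beta^2)

riigPdf = @(x, a, b, d, g) sqrt(2/pi) .* a.^(3/2) .* d .* exp(d.*g) ...
    .* x .* (d.^2 + x.^2).^(-3/4) ...
    .* besselk(3/2, a .* sqrt(d.^2 + x.^2)) ...   % modified Bessel, 2nd kind
    .* besseli(0, b .* x);                        % modified Bessel, 1st kind

plot(x, riigPdf(x, alpha, beta, delta, gam));
xlabel('x'); ylabel('p(x)'); title('RiIG pdf (illustrative parameters)');
```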

The RiIG parameter (δ) map is created by processing the contourlet and curvelet coefficient images via a square sliding window, which results in the contourlet and curvelet parametric (CP) images. For the Nakagami image, parameter (m) mapping is considered, and for the NIG image, parameter (α) mapping. This procedure is shown in [22], where the author constructed a Nakagami parametric image by computing the image parameter for each pixel. It should be mentioned that while we created the images in the domains of the contourlet and curvelet transforms, the literature [22,39,40] obtained the parametric images in the spatial domain. According to earlier research, the best sliding window for producing the parametric image is a square with sides three times the pulse duration of the incident ultrasound. In this study, each local RiIG parameter (δ) was estimated from the contourlet and curvelet sub-band coefficient images using a sliding window of 13 × 13 pixels. The sliding window should be larger than the speckle and able to discern different local structural differences in malignancies. As the window was moved across the entirety of the contourlet and curvelet sub-band coefficient images in steps of 1 pixel, the value assigned to the pixel at the window's center at each position was the local RiIG parameter (δ). The map of RiIG parameter values produced by this technique is known as the RiIG parametric image. Using the relevant figures and a percentile probability plot (pp-plot), the RiIG statistical model has already been shown to be preferable to the Nakagami statistical model [34]. In this study, the appropriateness of RiIG statistical modeling over the Nakagami and normal inverse Gaussian (NIG) statistical models is shown in Figure 5 by contourlet and curvelet parametric images and percentile probability plots (pp-plots).
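A minimal sketch of the sliding-window parameter mapping described above follows (not the authors' code). The RiIG δ estimator is abstracted behind a hypothetical helper, estimate_riig_delta, since the source does not specify its estimation method; the input sub-band C is taken from the earlier decomposition sketch.

```matlab
% Hedged sketch: build a RiIG delta-parameter map from one sub-band
% coefficient image with a 13 x 13 sliding window moved in 1-pixel steps.
% estimate_riig_delta is a hypothetical placeholder for whatever RiIG
% parameter estimator is used (e.g., moment matching or ML fitting).
C    = ctrCoeffs{4}{32};                % one contourlet sub-band (see earlier sketch)
w    = 13; half = floor(w/2);
Cp   = padarray(C, [half half], 'symmetric');  % pad so the map matches C in size
dMap = zeros(size(C));

for r = 1:size(C,1)
    for c = 1:size(C,2)
        block = Cp(r:r+w-1, c:c+w-1);              % local 13 x 13 neighborhood
        dMap(r,c) = estimate_riig_delta(block(:)); % hypothetical estimator
    end
end
% dMap is the contourlet parametric (CP) image for this sub-band.
```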

Figure 5. Comparing Nakagami, NIG, and RiIG statistical modeling with the aim of classifying images with (a) contourlet parametric (CP) images and (b) curvelet parametric (CP) images using percentile probability plots (pp-plots) that show empirical, Nakagami, NIG, and RiIG cumulative density functions (CDFs). It can be seen from both pp-plots that the RiIG CDF, as opposed to the Nakagami and NIG CDFs, closely tracks the empirical CDF. Additionally, it shows that for parametric modeling of breast ultrasound images, the RiIG distribution is more appropriate.

Correlated-Weighted Contourlet- or Curvelet-Transformed RiIG (CWCtr-RiIG or CWCrv-RiIG) Image
The CP images are correlated with the appropriate contourlet and curvelet sub-band coefficient images to produce the CWCtr-RiIG and CWCrv-RiIG images. By executing correlation operations with the corresponding contourlet and curvelet sub-bands, the CP images' parameter values are all weighted; hence the term "correlated-weighted contourlet- or curvelet-transformed RiIG" is used to describe these images. Figure 6 displays the transformation from B-mode image to correlated-weighted parametric image at contourlet decomposition level P4D32 and curvelet decomposition level S4A32, with the corresponding image pixel value ranges. The transformation proceeds as follows: first, the B-mode image is transformed into a contourlet or curvelet coefficient image, which is then modeled by RiIG to obtain a contourlet or curvelet RiIG image. For comparison purposes, WCP [34] images (i.e., the contourlet or curvelet coefficients are weighted by multiplication with their corresponding RiIG image to obtain WCtr-RiIG and WCrv-RiIG images) are also simulated. Finally, the CWCtr-RiIG and CWCrv-RiIG images are simulated in the same way, except weighted by correlation rather than multiplication. To reduce the computational time for constructing CWCtr-RiIG and CWCrv-RiIG images, six sub-bands from the contourlet transform's pyramidal decomposition at levels 2, 3, and 4 and the curvelet transform's decomposition at scales 2, 3, 4, and 5 are carefully selected as the most suitable for feature extraction; in the contourlet domain, those pyramidal levels contain 8, 16, and 32 directional sub-bands, and in the curvelet domain, those scales contain 16, 32, 32, and 64 angle sub-bands, respectively. In this study, the contourlet directional sub-bands in each pyramidal level and the curvelet angle sub-bands in each scale with larger sizes are taken into consideration because they yielded better results than the other sub-bands. Therefore, the chosen sub-bands for contourlet analysis are pyramidal level-2 directional level-4 (P2D4), as well as P2D8, P3D8, P3D16, P4D16, and P4D32, which are shown in Figure 7i. For the curvelet domain, the chosen sub-bands are scale-2 angle-16 (S2A16), as well as S3A32, S4A32, S5A16, S5A32, and S5A64, which are shown in Figure 7ii. As previously stated, the primary rationale for choosing these sub-bands is that they offer the maximum resolution for the images, which is crucial for the classification process.
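The source does not spell out the exact correlation operation. The sketch below assumes a local windowed Pearson correlation between each sub-band coefficient image and its RiIG parametric image, which is one plausible reading of "correlated-weighted"; the window size and the final assignment are assumptions, chosen to be consistent with the reported CW image pixel range of −1 to 1.

```matlab
% Hedged sketch: correlated weighting of a CP image. Assumes a local
% (13 x 13) Pearson correlation between the coefficient image C and its
% RiIG parametric image dMap (both from the earlier sketches); this
% reading of "correlated-weighted" is an assumption, not a stated formula.
w   = 13;
avg = @(I) imboxfilt(I, w);                       % local means via box filter

muC   = avg(C);           muD = avg(dMap);
covCD = avg(C .* dMap) - muC .* muD;              % local covariance
sdC   = sqrt(max(avg(C.^2)    - muC.^2, eps));    % local std of C
sdD   = sqrt(max(avg(dMap.^2) - muD.^2, eps));    % local std of dMap

rho = covCD ./ (sdC .* sdD);                      % local correlation in [-1, 1]
% The reported CW image pixel range of -1 to 1 suggests the correlation
% map itself acts as the correlated-weighted image for this sub-band.
CWCtrRiIG = rho;
```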

Proposed Classification Schemes
The proposed correlated-weighted statistically modeled contourlet and curvelet coefficient image-based classification schemes are illustrated in Figure 8. To inspect classification performance, the deep CNN fully connected classifier and three machine learning classifiers, namely SVM, KNN, and random forest, are considered in this study. All of the classifiers used in this study were implemented in MATLAB using default toolbox parameters. In both the deep CNN-based and machine learning-based classification schemes, the correlated-weighted contourlet-transformed RiIG (CWCtr-RiIG) stack images of dimension 224 × 224 × 6 are applied as the input. For training, neural networks frequently need many more samples than the 250 images of Database-I, 163 images of Database-II, and 647 images of Database-III. To create three large databases with a combined total of 6000 images, the sample count was raised by augmentation to 2000 for each of the three databases, with an equal proportion of malignant and benign instances. Each B-mode image has six sub-bands, increasing the total number of images to 6000 × 6 = 36,000 contourlet coefficient and 36,000 curvelet coefficient images. As any form of scaling or rotation would eliminate features dependent on size or orientation, only translational augmentation of 1 to 11 pixels in both directions is carried out on the base images. The overall process is implemented with the "imageDataAugmenter" MATLAB function. Figure 7 makes it clear that the images produced by the various curvelet and contourlet sub-band coefficients all have distinct sizes. All of the images are resized to 224 × 224 because a CNN requires all of its inputs to be the same size. Then, 6000 stack images of size 224 × 224 × 6 are produced by stacking the corresponding six sub-band images. The CNN network employed in this work is inspired by the custom CNN network provided in [34], which has 375,500 parameters and uses weighted contourlet parametric (WCP) images. The difference in our proposed scheme is that the proposed deep CNN architecture has 316,400 parameters and employs 224 × 224 × 6 stack CWCtr-RiIG or CWCrv-RiIG images as input in two different multi-resolution transform domains, namely the contourlet and curvelet transform domains, respectively.
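A minimal sketch of the translation-only augmentation step follows, assuming MATLAB's Deep Learning Toolbox imageDataAugmenter as named in the text; the ±11-pixel bounds come from the text, while the datastore wiring is illustrative (six-channel stacks may require a custom datastore in practice).

```matlab
% Hedged sketch: translation-only augmentation, as scaling/rotation would
% destroy size- and orientation-dependent features. Bounds of up to 11
% pixels in both directions follow the text.
aug = imageDataAugmenter( ...
    'RandXTranslation', [-11 11], ...
    'RandYTranslation', [-11 11]);

% imds is a hypothetical datastore of the 224 x 224 x 6 stack images.
augSource = augmentedImageDatastore([224 224], imds, ...
    'DataAugmentation', aug);
```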
In the deep CNN-based approach, the activation function is generated by combining the SoftMax and sigmoid functions, and in the machine learning-based approach, features are taken from the deepest CNN layer (the global average pooling layer) and applied to three different machine learning classifiers, namely SVM, KNN, and random forest. The proposed network is also tested with WCP images, which are constructed by multiplication [34]. The suitability of the CWCtr-RiIG and CWCrv-RiIG images over the WCP images is shown in Figure 6 and Table 2. It is observed that the CWCtr-RiIG and CWCrv-RiIG images require less training time in the same proposed deep CNN network than the WCP images. Moreover, the WCP images have pixel values from 40 to 255 in the contourlet and curvelet transform domains; when applied to the CNN network, those 0-to-255 pixel-value images are normalized and show less variation, because pixel values of 255 and above 200 are converted to values near 1. On the other hand, the CWCtr-RiIG and CWCrv-RiIG image pixel values range from −1 to 1, giving more variation when normalized in a deep CNN network, which has an impact on feature extraction. Table 3 shows the architecture of the suggested deep CNN network configuration. To ensure that the testing and training samples are completely separate, a 90%-to-10% split is employed: 10% of the unaugmented database images and their matching augmented images are randomly chosen for testing, and the remaining 90% are used for training. The accuracy can be significantly biased upward relative to a genuine test if the test data and training data overlap. The hyperparameters of the neural network are chosen using the average validation accuracy together with a tenfold cross-validation scheme and an exhaustive grid search. The batch size and learning rate for this network are 60 and 0.01, respectively, with the Adam optimization algorithm [41]. The CNN network is trained on the training data for 4000 cycles. The proposed technique's performance is evaluated using accuracy, sensitivity, specificity, PPV, NPV, and other performance indicators. Once the TP, TN, FP, and FN counts have been measured, the confusion matrices are constructed. True positive (TP) denotes a malignant tumor and true negative (TN), a benign tumor. The findings are discussed in Section 3.
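The stated hyperparameters (batch size 60, learning rate 0.01, Adam) could be wired up as follows. This is a sketch assuming MATLAB's Deep Learning Toolbox; the layer graph is abstracted behind a hypothetical `layers` variable since Table 3 is not reproduced here, and the interpretation of "4000 cycles" as epochs is an assumption.

```matlab
% Hedged sketch: training configuration matching the stated hyperparameters.
% `layers` stands in for the Table 3 architecture (not reproduced here);
% augSource is the augmented datastore from the previous sketch.
opts = trainingOptions('adam', ...
    'MiniBatchSize',    60, ...
    'InitialLearnRate', 0.01, ...
    'MaxEpochs',        4000, ...  % "4000 cycles" in the text; epochs assumed
    'Shuffle',          'every-epoch', ...
    'Verbose',          false);

net = trainNetwork(augSource, layers, opts);
```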

Results
The suggested classification schemes evaluate the classification performance on correlated-weighted parametric versions of contourlet- and curvelet-transformed images for both schemes. The findings are displayed in Table 4, where it is clear that the use of statistical modeling on the contourlet and curvelet transforms increases classification accuracy. Here, Database-I, -II, and -III attain the highest accuracies of 97.05%, 97.35%, and 98% with the SVM classifier; 97.85%, 98.05%, and 98.25% with the KNN classifier; 98.15%, 98.40%, and 98.85% with the random forest classifier; and 98.25%, 98.45%, and 98.95% with the deep CNN classifier, respectively. The appropriateness of RiIG modeling of B-mode images, rather than the Nakagami, Gaussian, and normal inverse Gaussian (NIG) statistical models, was already depicted in the earlier literature [33,34]. In this study, the accuracy is also compared with the Nakagami and NIG statistically modeled CWCrv-Nakagami, CWCtr-Nakagami, CWCrv-NIG, and CWCtr-NIG images along with the RiIG-modeled correlated-weighted images, and it is observed that RiIG is highly suitable for correlated-weighted transform-domain parametric images in breast tumor classification. From the results, it is seen that the deep CNN classifier has the best classification performance. A new activation function is applied here, combining the SoftMax and sigmoid activation functions. The SoftMax function provides the softened maximum probability in multiclass classification. The correlated-weighted contourlet- or curvelet-transformed RiIG images have pixel values from −1 to 1. In a few cases, the maximum probabilities of the two classes (i.e., benign and malignant) appeared the same. By adding the sigmoid activation function, which provides a hard decision (e.g., benign or malignant), discrimination in such cases becomes possible. The SoftMax function is given by

σ(z)_i = e^(z_i) / Σ_j e^(z_j),  i = 1, …, K,

where z is the input vector, e^(z_i) is the standard exponential function applied to element i of the input vector, and the denominator sums the exponentials over all K classes of the multi-class classifier. For multiclass classification, the SoftMax activation function is a better choice. In this paper, by using only the SoftMax function, the accuracy, sensitivity (true positive rate), and specificity (true negative rate) attained are 98.95%, 99.19%, and 98.71%, respectively, with an F1 score of 0.989. Another activation function, having a nonlinear decision boundary, is the sigmoid function, defined as

σ(z) = 1 / (1 + e^(−z)),

where e is Euler's number. Combining the SoftMax function with the sigmoid function, a new activation function is generated. Applying this combined activation function, the accuracy, sensitivity (true positive rate), and specificity (true negative rate) attained are 98.95%, 98.9%, and 99%, respectively, with an F1 score of 0.99; that is, although the accuracy is unchanged, the F1 score slightly increases. In a deep learning-based classification task, each decimal place beyond a high accuracy (such as 98%) is significant. Table 4 makes it clear that RiIG is more appropriate for the statistically modeled B-mode images, as it proved more effective than the Nakagami and NIG statistical models for all four classifiers on Database-I, -II, and -III. Additionally, the findings showed that the deep CNN-based classification scheme with a fully connected classifier provided better accuracy than the machine learning classifiers.
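For concreteness, here is a sketch of the two activations in MATLAB. The way they are combined below (a sigmoid applied to the SoftMax probabilities to harden near-tied decisions) is an illustrative assumption only, as the source omits the exact combined formula; the gain factor and logits are likewise hypothetical.

```matlab
% Hedged sketch: SoftMax and sigmoid activations. How the paper combines
% them is not given explicitly; the combination below (sigmoid hardening of
% the SoftMax probabilities, gain of 10) is an illustrative assumption.
softmaxFn = @(z) exp(z - max(z)) ./ sum(exp(z - max(z)));  % stable SoftMax
sigmoidFn = @(z) 1 ./ (1 + exp(-z));                       % logistic sigmoid

z    = [2.31; 2.29];              % hypothetical benign/malignant logits (near-tie)
p    = softmaxFn(z);              % softened probabilities (~0.505 / ~0.495)
hard = sigmoidFn(10*(p - 0.5));   % assumed hardening step toward 0 or 1
[~, cls] = max(hard);             % final class decision
```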
In the SVM machine learning classifier-based approach, the highest accuracy is obtained on Database-III, where the accuracy, sensitivity, specificity, PPV, and NPV are 98%, 98.19%, 97.81%, 97.80%, and 98.20%, respectively. In the case of KNN, the highest performance is also obtained on Database-III, where the accuracy, sensitivity, specificity, PPV, and NPV are 98.25%, 98.01%, 98.49%, 98.50%, and 98%, respectively. In the case of random forest, the best performance is likewise obtained on Database-III, where the accuracy, sensitivity, specificity, PPV, and NPV are 98.85%, 98.99%, 98.7%, 98.7%, and 99%, respectively. For the deep CNN fully connected classifier, the overall best performance is obtained, with accuracy, sensitivity, specificity, PPV, and NPV of 98.95%, 98.9%, 99%, 99%, and 98.9%, respectively. The suggested RiIG-based CWCtr-RiIG images are the best option for the categorization of breast tumors in both the deep CNN fully connected classifier-based approach and the machine learning classifier-based approach, as shown in Table 4. The confusion matrices for the best performance on Database-III, shown in Figure 9, display the suggested methods using the deep CNN, SVM, KNN, and random forest classifiers along with performance indices such as accuracy, sensitivity, specificity, PPV, and NPV, with malignant tumors measured as true positives (TP), benign tumors as true negatives (TN), and the corresponding false positives (FP) and false negatives (FN). The greatest values of accuracy, sensitivity, specificity, PPV, and NPV for Database-III, utilizing both categorization schemes, are seen to be greater than 98%.
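The performance indices follow their standard definitions. As a quick reference, the sketch below computes them from TP, TN, FP, and FN counts; the example counts are hypothetical, not taken from Figure 9.

```matlab
% Hedged sketch: standard metrics from confusion-matrix counts.
% The counts below are hypothetical placeholders, not values from Figure 9.
TP = 990; TN = 989; FP = 11; FN = 10;   % malignant = positive, benign = negative

accuracy    = (TP + TN) / (TP + TN + FP + FN);
sensitivity = TP / (TP + FN);           % true positive rate (recall)
specificity = TN / (TN + FP);           % true negative rate
PPV         = TP / (TP + FP);           % positive predictive value (precision)
NPV         = TN / (TN + FN);           % negative predictive value
F1          = 2 * PPV * sensitivity / (PPV + sensitivity);
```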

Discussion
The best classification accuracy was demonstrated in the previous section using the deep CNN classifier with the RiIG-based CWCtr-RiIG images. Table 5 offers a comparison with comparable works. Using the same Database-I, P. Acevedo et al. [5] claimed 94% accuracy, with a 0.942 F1 score, while Karthiga et al. [20] reported 94.5% accuracy, with a 0.945 F1 score. In light of this, the highest accuracy achieved by the proposed method on the same Database-I is roughly 98.30%, with an F1 score of 0.983, which is noticeably better. Hou et al. [29] employed Database-II in a different study and reported a 94.8% accuracy rate. Combining the same Database-II with additional databases, Shin et al. [30] found an accuracy of 84.5%. According to Byra et al. [31], utilizing Database-II, their accuracy was 85.3% and their F1 score was 0.765. Qi et al. [32] demonstrated an accuracy of 94.48% with an F1 score of 0.942 on Database-II. On the other hand, the suggested approach employing Database-II offers the highest accuracy, 98.45%, and an F1 score of 0.985. With Database-I, the accuracy of Kabir et al. [34] was 98.25% with an F1 score of 0.982; for Database-II, it was 98.35% with an F1 score of 0.984; and for Database-III, it was 98.55% with an F1 score of 0.986. The approach of Ka Wing Wan et al. [42] yields an accuracy on Database-III of 90% using a random forest classifier and 91% using a CNN, with an F1 score of 0.83. The identical Database-III was used by Moon et al. [43], who reported 94.62% accuracy with a 0.911 F1 score. The accuracy and F1 score of the suggested method are superior in comparison, with the greatest accuracy being roughly 98.95% and an F1 score of 0.99. Additionally, using the same validation strategy as in [42,43], the suggested dual-input CWCtr-RiIG image-based deep CNN technique is deployed for classification on Database-III with an 80% training to 20% testing ratio. With an F1 score of 0.98, this experiment's accuracy, sensitivity, and specificity are still better than those of [42,43]. The box plots in Figure 10 compare the accuracies of Table 5 and also show that the proposed method performs consistently in comparison with other approaches. As mentioned earlier, the images in Database-I have undergone speckle reduction, compressed dynamic range, and persistence pre-processing. For Database-II and -III, the images have undergone edge enhancement, speckle reduction, and persistence only. Due to the heavily compressed dynamic range, the resultant accuracy using Database-I is lower than that of Database-II and -III. Moreover, when the images of Database-II and -III are combined, the classification accuracy attains 98.4%, while combining the images of all three databases causes the classification accuracy to fall to 97.15%. Therefore, it seems that an automated edge enhancement process may further improve performance in the case of Database-I. However, incorporating an edge enhancement technique will increase the complexity of the method. It is an interesting area of future exploration to develop a novel neural network architecture that can deliver a high degree of accuracy even with heavily compressed dynamic-range images, without requiring additional pre-processing such as edge enhancement.

Conclusions
In this paper, a novel approach to breast tumor classification is presented, employing correlated-weighted contourlet- and curvelet-transformed RiIG statistically modeled images in a deep CNN architecture. In the first approach, the RiIG statistically modeled CWCtr-RiIG and CWCrv-RiIG images are classified by the deep CNN fully connected classifier. In the second approach, the CWCtr-RiIG and CWCrv-RiIG images are classified by deep CNN-SVM, KNN, and random forest machine learning classifiers. First, it is demonstrated that a high level of accuracy can be attained by using the deep CNN fully connected classifier. Second, a new, custom-designed deep CNN architecture is proposed for classifying CWCtr-RiIG and CWCrv-RiIG images of breast tumors, since it performs more accurately. Additionally, the suggested deep CNN design, combining the SoftMax and sigmoid activation functions, can provide extremely high sensitivity, specificity, NPV, and PPV values. On benchmark publicly available datasets, both schemes show superior classification performance to the state-of-the-art techniques. Additionally, the RiIG distribution is very well-suited for modeling the characteristics of the contourlet and curvelet transform coefficients of breast tumor images obtained in B-mode ultrasound. There is room for improvement by applying transformer model-based approaches and including additional datasets.