Breast Lesions Screening of Mammographic Images with 2D Spatial and 1D Convolutional Neural Network-Based Classifier

Mammography is a first-line imaging examination that employs low-dose X-rays to rapidly screen for breast tumors, cysts, and calcifications. This study proposes a two-dimensional (2D) spatial and one-dimensional (1D) convolutional neural network (CNN)-based classifier for the early detection of possible breast lesions (tumors) in mammographic regions of interest where such lesions are likely to occur, with the aim of reducing patients' mortality rates. The 2D spatial fractional-order convolutional processes are used to strengthen and sharpen the lesions' features, denoise the image, and improve the feature extraction processes. Then, an automatic extraction task is performed using a specific bounding box to sequentially pick out feature patterns from each mammographic image. The multi-round 1D kernel convolutional processes further strengthen and denoise the 1D feature signals and help to distinguish the levels of normality and abnormality signals. In the classification layer, a gray relational analysis-based classifier is used to screen the possible lesions into normal (Nor), benign (B), and malignant (M) classes. Developing the classifier for clinical applications in this way can reduce the classifier's training time, computational complexity level, and computational time, and can achieve a higher accuracy rate for clinical/medical purposes. Mammographic images were selected from the Mammographic Image Analysis Society image database for experimental tests on breast lesions screening, and K-fold cross-validations were performed. The experimental results showed promising performance, quantified for medical purpose evaluation in terms of recall (%), precision (%), accuracy (%), and F1 score.


Introduction
According to the 2021 International Agency for Research on Cancer study report and the statistics provided by Taiwan's Ministry of Health and Welfare (MOHW), breast cancer ranks first among cancers in women worldwide and in Taiwan [1,2], respectively, with approximately 2.3 million women diagnosed with breast cancer worldwide and approximately 14,217 women with breast cancer in Taiwan, which means that one woman in Taiwan is diagnosed with breast cancer approximately every 37 min. Early detection of possible breast lesions can reduce patients' mortality rates, which not only improves survival rates but also allows more effective treatment [3]. Routine first-line mammography examination can help to detect any possible breast lesions in the right or left breast. An automatic screening tool with digital image processing and artificial intelligence (AI) algorithms will assist clinicians and radiologists in preliminary diagnosis, will help to address the problem of insufficient human resources, and will allow clinicians to focus on follow-up medical strategies. Hence, in this study, we intend to develop a machine-vision-based medical tool. Machine learning (ML)- and deep learning (DL)-based methods, covering image enhancement, image segmentation, feature extraction, and image classification, have been applied for breast tumor screening to increase the speed of detection tasks and to decrease manual errors. Using AI-based methods can therefore help clinicians or radiologists to detect breast lesions more easily and to make better diagnostic decisions for treatment plans. Image examinations, such as mammography, magnetic resonance imaging (MRI), and computed tomography (CT), are widely used to detect the regions of breast lesions and microcalcification in the internal and external right/left breast. MRI is costly, and it requires a contrast agent to enhance the images and improve the diagnostic results. However, the contrast agent's dosage may affect the patient to a mild or severe degree [6,13,14].
Additionally, MRI may be a promising option for screening younger women with dense breasts who are at a higher risk of developing breast cancer. CT imaging offers clearer and better-resolution images for hard objects than for softer tissue, and may be an assistive tool for monitoring tumor spread. However, radiation is a major concern in CT imaging. X-ray mammography is a first-line imaging examination used to screen for breast lesions in women who have no symptoms of breast cancer. In mediolateral oblique views of mammography images, the focal, linear, segmental, regional, and multiple regional mass distributions indicate a higher probability of malignancy [15]. Hence, morphological patterns are key features in distinguishing malignant from benign masses, as shown in Figure 1, which refers to the morphological descriptors of the Breast Imaging-Reporting and Data System (BI-RADS). The seven assessment categories are used to characterize lesions [11,16,17], and the assessment of BI-RADS categories for mammogram classification is shown in Table 1.
Table 1. Assessment of BI-RADS categories for mammogram classification [11,16,17].
For mammography databases, such as Suspicious Regions on Mammograms from Palermo Polyclinic (343 images, manually annotated by experts) [18], the Digital Database of Screening Mammography (DDSM) (2620 studies from hospitals and medical universities) [19], the Curated Breast Imaging Subset of the Digital Database for Screening Mammography (modified and standardized version of DDSM) [20], Image Retrieval in Medical Applications (11,000 X-ray images, dataset with region of interest (ROI) annotations) [21], and the Mammographic Image Analysis Society (MIAS) (322 screening mammograms) [22,23], the clinical information was annotated by expert radiologists, including image size, background tissue types, class and severity of abnormality (breast lesions), and coordinates of the center of abnormalities.
Hence, in an ROI, researchers can easily extract feature patterns (Figure 1) as training datasets to train supervised ML- and DL-based classifiers for mass/suspicious lesion segmentation, mass detection/classification, and abnormality detection, such as a support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor classifier, or deep multilayer convolutional neural network (CNN) [6,24–29]. They can be carried out on structured data for binary, multiclass, and multi-label classification applications [30]. The above-mentioned supervised learning methods can be used to train a classifier model with labeled feature patterns for breast lesion detection. However, traditional ML-based methods, which consist of an input layer, one or more hidden layers, and an output layer, lack feature enhancement and feature extraction functions. High-dimensional data processing is a major concern, and feature selection and feature extraction are used as preprocessing steps to reduce the data dimensionality and overcome this drawback. The deep CNN-based methods combine multiple convolutional-pooling layers (>10 layers in a general configuration) and a classification layer to perform automatic end-to-end enhancement, noise filtering, feature extraction, and pattern recognition for tasks such as mass classification, lesion detection and localization, and lesion segmentation/ROI detection, by using a fully convolutional network (FCN), U-Net CNN, region-based CNN (R-CNN), faster R-CNN, TTCNN (transferable texture convolutional neural network), and Grad-CAM (gradient-weighted class activation mapping)-based CNN [31–38]. The multiple convolutional-pooling processes can extract the desired features, from low-level features to high-level information (sharpening process), for detecting normality objects, and can then increase the detection accuracy.
However, these configurations increase the complexity level, computational time, data dimensionality, and volume of training datasets required, which can lead to overfitting problems during the training of a classifier. Moreover, when classifying the mammography dataset, maximum pooling (MP) is usually used because normal tissue lies near the background intensity, whereas the lighter regions are of interest for abnormality extraction. The MP selects the brighter pixel values from the image within a specific pooling mask; hence, the dimension of the feature patterns can be effectively reduced, thereby mitigating the overfitting problems in training tasks with very large training datasets [39–41]. The ROI is localized using Grad-CAM, which also replaces the conventional fully connected layer with global average pooling (GAP) [38]. The feature patterns are then obtained by applying the rectified linear unit (ReLU) to the weighted sum of the feature patterns, with the weights obtained using the GAP [42]. Therefore, the classification accuracy of these multilayer structures can be improved for digital image classification. However, the limitations of the multilayer classifier lie in determining the number of convolutional-pooling layers, the number of convolution kernels, and the sizes of convolutional masks when setting the structure of the convolutional-pooling layers. Moreover, too many convolutional-pooling processes result in a loss of spatial and edge information, which does not help key feature extraction.
Hence, we intend to design a 2D spatial and 1D CNN-based classifier to simplify the tasks of image enhancement, feature extraction, and pattern recognition, comprising a two-dimensional (2D) spatial convolutional layer, a flattening process layer, one-dimensional (1D) convolutional layers, a pooling process layer, and a classification layer, which are integrated into an individual multilayer classifier for breast lesions screening. In the 2D spatial convolutional layer, fractional-order-based convolution, Grad-CAM activation mapping, or integral image (II) operations [41–45] can be employed to perform the convolutional processes to detect the desired object's edge and contour in the specific region along the horizontal and vertical directions. Different feature patterns can be extracted through convolutional operations by using different filtering mask weights and mask size assignments, such as 3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11, and so on. Hence, these extracted feature patterns can be used in mammographic classification studies and to identify breast lesions. However, the fractional-order-based masks require selecting suitable fractional-order parameters (v ∈ (0, 1)) and sizes of convolutional masks [41,46,47] to extract different aspects (horizontal, vertical, or diagonal edges) and useful features from the input images. The II process performs the spatial convolutions by using the summed area table (SAT) [43–45] to detect the line and diagonal edge features. The II-based convolutional process does not require the convolutional mask's parameters and sizes. Therefore, after the 2D spatial convolutional process, via image enhancement (to adjust the contrast and maintain the features), the possible lesion can be easily detected and located in an ROI with a specific bounding box, and the "feature pattern" can be easily picked out from the original mammographic image.
Then, after the flattening process converts the 2D feature pattern into a "1D feature vector", the multi-round 1D convolutional processes enhance the incoming feature vector as feature signals, which also strengthens the significant characteristics for further feature extraction and classification applications.
The proposed 1D convolutional operations use simple linear weighted sums to process the incoming feature signals and can remove unwanted noise. Additionally, the 1D kernel convolutional process can quantify the difference levels in feature signals for separating the Nor class from the B and M classes. In real-time applications, this simple architecture is easy to implement for the intended medical purpose. In the classification layer, the 1D feature pattern is fed into the input layer of a gray relational analysis (GRA)-based classifier [48,49], and the mammographic classification of breast lesions can be identified, including the Nor, B, and M classes. In experimental validations, mammographic images were collected from the MIAS database [22,23], including training datasets and testing datasets for training the classifier and verifying the classifier's performance in clinical applications. With K-fold cross-validation, the experimental results showed promising classifier performance for automatic breast lesions screening, in terms of the precision (%), recall (%), accuracy (%), and F1 score indices [39–41].

Mammographic Images Collection
We collected the digital mammographic images from the MIAS image database (v1.21, 2015), including the original 322 images (161 pairs, including right and left images) at a spatial resolution of 50 µm pixel edge with a linear response in the optical density range of 0.0–3.2 at 8 bits/pixel in a portable gray map format [22,23]. Image filenames consisted of three-digit serial numbers, followed by l or r for the left or right breast, respectively, and s, m, l, or x to denote the image size. The most common image size was 4320 pixels × 2600 pixels, which was selected for the proposed study in breast tumor screening. The clinical information was confirmed by expert radiologists for biomarkers, such as image size, image category, background tissue (fatty, fatty glandular, and dense glandular), class of abnormality (calcification, masses, asymmetry, architectural distortion, and Nor), severity of abnormality (B and M classes), and location of the center of abnormality [22,23]. A total of 59 subjects (35 normal subjects and 24 abnormal subjects) with 118 mammographic images (59 pairs, including right and left images) were selected for experimental verification. According to the abnormality location, the ROI of each image could be extracted with a 100 × 100 bounding box around the center, and a total of 500 feature patterns (200 Nors, 150 Bs, and 150 Ms) could be extracted from the 118 images, as seen in the feature templates in Figure 1, which are available for training and validating the classifier at the learning and recalling stages.
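The bounding-box extraction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_roi`, the list-of-rows image representation, and the choice to shift the window inward at the image border are all assumptions made for the example, and converting the database's annotated coordinates into row/column indices is left to the caller.

```python
def extract_roi(image, center_row, center_col, size=100):
    """Crop a size x size patch centred on (center_row, center_col).

    `image` is a list of rows, each a list of gray levels. When the window
    would cross the image border it is shifted inward (an assumption of this
    sketch), so the patch always has the requested shape.
    """
    rows, cols = len(image), len(image[0])
    if size > rows or size > cols:
        raise ValueError("bounding box is larger than the image")
    half = size // 2
    # Clamp the top-left corner so the whole window stays inside the image.
    top = min(max(center_row - half, 0), rows - size)
    left = min(max(center_col - half, 0), cols - size)
    return [row[left:left + size] for row in image[top:top + size]]
```

With `size=100`, each call yields one 100 × 100 feature pattern of the kind counted in the 500-pattern dataset.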

Integral Image (II)-Based Convolutional Process
The II-based convolutional process can rapidly evaluate the summations in a specific rectangular region (as seen in the summed area table (SAT) [43–45] in Figure 2), which performs the convolutional process irrespective of the convolutional mask sizes. It facilitates the summation of pixels over axis-aligned rectangular regions in constant time, regardless of the neighborhood size. For a discrete image I at the pixel (x, y), its II, IntegI(•), is defined as the sum of the pixel values of the upright rectangle ranging from the location (0, 0) (top left corner) to the location (x, y) (bottom right corner), expressed as [43–45]:
$$\mathrm{IntegI}(x, y) = \sum_{x' \le x,\; y' \le y} I(x', y'), \quad x' = 0, 1, 2, \ldots, x, \quad y' = 0, 1, 2, \ldots, y \tag{1}$$
where the summed area is a rectangular region with four array references. As seen in the SAT in Figure 2, the 2 × 2 II image can be computed in parallel, and, given the input pixel I(x, y), the II can be computed in a single pass over the image, as follows:
$$\mathrm{IntegI}(x, y) = I(x, y) + \mathrm{IntegI}(x - 1, y) + \mathrm{IntegI}(x, y - 1) - \mathrm{IntegI}(x - 1, y - 1) \tag{3}$$
Hence, with four values from the SAT, we can simply compute the II's value by using Equation (3), where IntegI(x, −1) = IntegI(−1, y) = 0. An II-based convolutional process can rapidly compute the summations over the image's subregions as a linear combination of four pixels. For a matrix I of size n × n, each column and each row need (n − 1) additions in the column-wise and row-wise prefix sums, respectively, which takes ≈ 2n(n − 1) additions in total and reduces the computational complexity level at the convolutional layer using the SAT.
For example, Figure 3b,c show the preliminary results of the 2D spatial convolutional processes using the fractional-order-based convolutional and II-based convolutional operations, respectively. It can be seen that the 2D spatial convolution process enhances the edge information where the gray-level values change significantly. These spatial convolutional processes act as a low-pass frequency filter [50] and remove the high-spatial-frequency components, which can enhance the contour of a possible abnormal object and retain non-characteristic information. Hence, the desired object can be easily localized within the specific region. Then, the ROI can be identified in a 2D mammographic image (Figure 3a,c) for further feature extraction in the right or left breast, as seen in Figure 3. Hence, we can locate the possible lesion, automatically locate the ROI, and extract the feature patterns from an enhanced mammographic image with an n × n bounding box (n = 100 in this study). The fractional-order-based convolution has promising denoising and sharpening capabilities; however, it requires selecting appropriate mask sizes and fractional-order parameters, v ∈ (0, 1). The SAT enables the rapid calculation of the sum of pixel values (Figure 2) for II-based convolutional processes in arbitrarily sized, axis-aligned rectangular sections, allowing the implementation of real-time computations and a reduction in the computational complexity level.
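The summed area table can be illustrated with a short standard-library sketch. It uses the usual single-pass recurrence for an integral image and the four-look-up box sum; the function names `integral_image` and `box_sum` and the extra zero row and column used to encode the boundary condition are choices made for this example, not details taken from the study.

```python
def integral_image(image):
    """Build a summed area table in a single pass.

    Uses II(x, y) = I(x, y) + II(x-1, y) + II(x, y-1) - II(x-1, y-1),
    with out-of-range entries treated as 0.
    """
    rows, cols = len(image), len(image[0])
    # The extra zero row and column on the top/left encode the boundary values.
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = (image[r][c] + sat[r][c + 1]
                                 + sat[r + 1][c] - sat[r][c])
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of pixels in rows top..bottom and columns left..right (inclusive).

    Always four look-ups, whatever the size of the box.
    """
    return (sat[bottom + 1][right + 1] - sat[top][right + 1]
            - sat[bottom + 1][left] + sat[top][left])
```

Because `box_sum` costs the same for a 3 × 3 box as for a 100 × 100 box, box-filter style enhancement does not depend on the mask size, which is the property the II-based process relies on.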

Feature Extraction with Multi-Round 1D Convolutional Processes and 1D Pooling Process
After image enhancement with the II-based convolutional process, we can extract the feature patterns from any mammographic image by using a 100 × 100 bounding box, as seen in the red block in Figure 4. In the feature extraction layer, we first perform the normalization and flattening processes with a FLAT operator to convert the matrix form (100 × 100) into a vector form (1 × 10,000), where $I_{xy}(p, q)$ are the II values within the ROI, p = 1, 2, 3, . . . , n, and q = 1, 2, 3, . . . , n (n = 100 in this study); FLAT(•) is the flattening operator; and $FLATI_x$ is the 1D data stream vector used as a 1D feature signal. Multi-round 1D convolutional processes are then applied to the incoming feature signal in a discrete-time form of the convolutional operation [39,40,51], where the vector $X_c[i]$ is the feature signal in the cth convolutional operation, with index c = 1, 2, . . . , $C_{th}$; the discrete kernel mask slides over the feature signal with stride = 1 in the cth round of the convolutional operation, with samples j = 0, 1, 2, 3, . . . , $M_c$ − 1; and $M_c$ is the data length of the 1D kernel mask ($M_c$ = 200 by default). In this study, we set two rounds of 1D convolutional processes ($C_{th}$ = 2) to deal with the feature signals, as seen in the flowchart in Figure 5. After 1D convolutional processing, the 1D pooling process (downsampling) and normalization process reduce the dimension of the feature signals, where the operator POOL(•) is the 1D pooling process; the vector x[i] is the downsampled 1D feature signal, obtained by using stride = 100; and the operator max($X_c$) finds the maximum value in the vector $X_c$.
As shown in Figure 5, the convolutional-pooling layer performs the normalization and flattening processes (②), the two-round 1D convolutional processes (③ and ④), and the 1D pooling and normalization processes (⑤) to obtain the 1D feature signal and filter the noise, which yields stable feature parameters for identifying different levels. Hence, the above-mentioned three processes can be combined into a feature extraction function for preliminarily separating the normal (Nor) class from the abnormal (B and M) classes, as seen in the downsampled 1D feature signals in Figure 6.
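The flatten, convolve, and pool sequence can be sketched in plain Python as below. Several details are assumptions of this sketch rather than statements from the study: the Gaussian mask is centred with unit sum and a spread of one sixth of its length, the convolution is zero-padded so that the signal length is preserved, and pooling is plain subsampling at the given stride followed by division by the maximum of the convolved signal. All function names are hypothetical.

```python
import math


def flatten_normalized(patch):
    """Flatten a 2D patch row by row and scale the values into [0, 1]."""
    flat = [float(v) for row in patch for v in row]
    peak = max(flat)
    return [v / peak for v in flat] if peak > 0 else flat


def gaussian_mask(length=200, sigma=None):
    """Discrete Gaussian mask with unit sum (shape and sigma are assumed)."""
    sigma = sigma if sigma is not None else length / 6.0
    mid = (length - 1) / 2.0
    weights = [math.exp(-0.5 * ((j - mid) / sigma) ** 2) for j in range(length)]
    total = sum(weights)
    return [w / total for w in weights]


def conv1d_same(signal, mask):
    """Stride-1 convolution, zero-padded so the output keeps the input length."""
    n, m = len(signal), len(mask)
    offset = m // 2
    out = []
    for i in range(n):
        acc = 0.0
        for j in range(m):
            k = i + offset - j
            if 0 <= k < n:
                acc += signal[k] * mask[j]
        out.append(acc)
    return out


def pool_normalized(signal, stride=100):
    """Keep every `stride`-th sample and divide by the signal maximum."""
    peak = max(signal)
    sampled = signal[::stride]
    return [v / peak for v in sampled] if peak > 0 else sampled


def extract_feature_signal(patch, mask_length=200, stride=100, rounds=2):
    """Flatten -> `rounds` Gaussian convolutions -> pooled feature signal."""
    signal = flatten_normalized(patch)
    mask = gaussian_mask(mask_length)
    for _ in range(rounds):
        signal = conv1d_same(signal, mask)
    return pool_normalized(signal, stride)
```

With the defaults, a 100 × 100 patch becomes a 1 × 10,000 signal and then a 1 × 100 feature signal, matching the dimensions given in the text. The pure-Python loops are slow for full-size patches, so an array library would be the natural choice in practice.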

Breast Lesions Screening with a GRA-Based Classifier
As shown in Figure 4, in the classification layer, the GRA-based classifier is a fully connected multilayer network, consisting of an input layer, a GRA-based layer, a summation layer, and an output layer. It uses the similarity level to automatically identify the possible classes, including the Nor, B, and M classes. Its pattern recognition scheme uses Gaussian-based gray indicators to separate the normality (Nor) level from the two abnormality levels, and it performs the pattern recognition tasks by using straightforward mathematical operations without optimization/fine-tuning algorithms, hyperparameter assignments, or iterative computations [40,48,49]. The Gaussian functions are used to measure the similarity level between a testing dataset, $x_0$, and the training datasets, $x_k$, which are represented as $x_0 = [x_1(0), x_2(0), x_3(0), \ldots, x_i(0), \ldots, x_{100}(0)]$ and $x_k = [x_1(k), x_2(k), x_3(k), \ldots, x_i(k), \ldots, x_{100}(k)]$, k = 1, 2, 3, . . . , K, respectively. The similarity level can be measured by the Euclidean distance (ED), where the parameter $d_i(k)$ is the difference between the testing dataset and each of the K training datasets, and K is the number of training datasets. The gray grade, g(k), is defined as a Gaussian function of this difference [49,50], where σ is the standard deviation, which can be automatically determined by the term "($\Delta d_{max} - \Delta d_{min}$)"; $\Delta d_{max}$ and $\Delta d_{min}$ are the maximum and minimum difference values, respectively; and the K comparative data are created from the training datasets, $x_k$, including (1) the Nor class, (2) the B class, and (3) the M class.
Then, the GRA-based classifier's output is normalized, where the parameters $w_{kj}$ are the connecting weights between the GRA-based layer and the summation layer, which encode the desired class for each input feature signal and are set by the K × 3 (m = 3, three classes in this study) output training data, encoded by value "1" or value "0". The output pattern vector for the three classes is encoded accordingly, and each output is decided by a threshold value of 0.5 to identify disease present (value 1) or disease absent (value 0). Hence, for our medical purpose, we can establish a classifier for automatic multi-label classification, consisting of an II-based convolutional process in the 1st convolutional layer for image enhancement; two 1D convolutional processes in the 2nd and 3rd convolutional layers (with a discrete Gaussian mask, data length = 200, and stride = 1); a 1D pooling layer (stride = 100) for feature extraction; and a GRA-based classifier in the classification layer, as seen in the summary of the proposed model in Table 2. As seen in Figure 7, the flowchart of the classifier's testing and validation includes image enhancement and denoising with the II-based spatial convolutional process, feature pattern extraction, the flattening process, the two-round 1D convolutional process, the 1D pooling process, and breast lesions screening, in keeping with its medical purpose in clinical application.
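A Gaussian-similarity classifier of this kind can be sketched as follows. Because the defining equations are not reproduced in the text above, the specific forms used here are assumptions: the Euclidean distance over all features, a gray grade of exp(−d²/(2σ²)) with σ taken as the range of the distances, and class scores formed as the one-hot-weighted sum of grades divided by the sum of all grades. The function name `gra_classify` is hypothetical.

```python
import math


def gra_classify(x0, train_x, train_y, threshold=0.5):
    """Score a test vector against K labelled training vectors.

    train_y[k] is a one-hot list, e.g. [1, 0, 0] for Nor, [0, 1, 0] for B,
    and [0, 0, 1] for M. Returns (scores, binary_outputs).
    """
    # Euclidean distance between the test vector and each training vector.
    dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(x0, xk)))
             for xk in train_x]
    # Spread taken from the range of the distances (assumed form);
    # fall back to 1.0 when all distances are equal.
    sigma = (max(dists) - min(dists)) or 1.0
    grades = [math.exp(-0.5 * (d / sigma) ** 2) for d in dists]
    total = sum(grades)
    n_classes = len(train_y[0])
    # Weighted sum per class, normalised by the sum of all gray grades.
    scores = [sum(g * y[j] for g, y in zip(grades, train_y)) / total
              for j in range(n_classes)]
    return scores, [1 if s >= threshold else 0 for s in scores]
```

No weights are fitted iteratively: the training patterns and their one-hot labels define the network directly, which is why this style of classifier needs no training epochs. How sharply the scores separate depends on the spread σ, so the thresholded outputs in this sketch should not be read as reproducing the reported behaviour.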

Experimental Setup
This study compares the proposed classifier with the traditional 2D CNN-based classifier in terms of training time, accuracy, and classifier performance. As seen in Table 2, we established the two multilayer classifiers by using different numbers of convolutional-pooling layers, different types of convolutional masks, and different sizes of convolutional masks. For the traditional classifier, we adopted a 2D spatial convolutional layer, two convolutional-pooling layers, a flattening layer, and a classification layer [41,52]. Two 3 × 3 fractional-order masks were used to perform the 2D spatial convolutional processes for enhancing the edge information of the possible breast lesions; the fractional-order parameter, v = 0.30–0.40, provided promising results for feature enhancement (v = 0.35 was selected in our study [52]). The number of kernel convolutional masks and maximum pooling (MP) masks was set to 16 in the 2nd and 3rd convolutional-pooling layers, respectively. The sizes of the kernel convolutional masks and maximum pooling masks were set at 3 × 3 and 2 × 2, respectively. Two kernel convolutional-pooling processes were used to extract the desired object's feature pattern and to reduce the dimensions of the feature patterns with an MP process for obtaining abstract features. Each kernel mask moved across the columns and rows in steps of 1 (stride = 1) at each convolutional operation. The padding parameter was set to 1 to maintain the size of the feature pattern (padding = 1). Each MP mask moved with a stride of 2 (stride = 2). The MP processes could mitigate the overfitting problem when training a multilayer classifier.
In the classification layer, a back-propagation neural network (BPNN) with one input layer (625 nodes), a 1st hidden layer (168 nodes), a 2nd hidden layer (64 nodes), and one output layer (3 nodes) was used. A gradient descent-based optimization algorithm, either the adaptive moment estimation method (ADAM) or the back-propagation algorithm [52,53], was used to adjust the BPNN's connecting weights and thereby determine the optimal parameters for raising the classifier's accuracy.
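As a quick check on the stated layer settings, the spatial size of a feature map after the two convolutional-pooling stages can be worked out with the standard output-size formulas. This is a sketch under the stated settings (100 × 100 input, 3 × 3 masks with stride 1 and padding 1, 2 × 2 pooling with stride 2); the function names are hypothetical, and how the 16 feature maps are combined before the 625-node input layer is not stated in the text, so the arithmetic below is per feature map only.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window, stride):
    """Output side length of a square pooling window."""
    return (size - window) // stride + 1


def baseline_feature_map_size(input_size=100, blocks=2):
    """Side length after `blocks` rounds of 3x3 conv (pad 1) + 2x2 max pool."""
    size = input_size
    for _ in range(blocks):
        size = conv_out(size, kernel=3, stride=1, padding=1)  # size preserved
        size = pool_out(size, window=2, stride=2)             # size halved
    return size


side = baseline_feature_map_size()
flattened_per_map = side * side
```

The side length goes 100 → 50 → 25, so one flattened feature map has 25 × 25 = 625 values, which is consistent with the 625 input nodes of the BPNN.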
We selected the ADAM algorithm to train the traditional 2D CNN-based classifier. In addition, we used a multi-core personal computer (PC) (Intel® Q370, Intel® Core™ i7 8700, DDR4 2400 MHz 8 GB × 3)-based platform to implement the two classifiers, as shown in Table 2, and also used a graphics processing unit (GPU) (NVIDIA® GeForce® RTX™ 2080 Ti, 1755 MHz, 11 GB GDDR6) to reduce the execution time for digital image processing and classification tasks. In the MIAS image database [22,23], we selected the image size of 4320 pixels × 2600 pixels (600 dpi for the vertical and horizontal resolutions, a bit depth of 24 bits) for breast lesions screening. According to the MIAS database's biomarkers, the categories and tumor locations could be identified. A total of 118 mammography images (right and left breasts, 59 subjects: 35 normal subjects and 24 abnormal subjects), including 70 images from normal subjects and 48 images from abnormal subjects, were obtained to extract the feature patterns for the training dataset and testing dataset; the training dataset was used to train the classifier for abnormality detection, and the testing dataset was used to validate the classifier's performance. This study used K-fold cross-validation ($K_f$ = 10) to evaluate the classifier's performance with four indices: precision (%), recall (%), accuracy (%), and F1 score [39–41]. The feasibility study is described in detail in the subsequent sections.

Multilayer CNN-Based Classifiers' Training and Validation
We randomly selected 80 mammography images (20 normal subjects and 20 abnormal subjects) from the MIAS image database, comprising 40 tumor-free images and 40 tumor images, to extract the feature patterns for training and testing the two classifiers, as shown in Table 2. All the breast lesions in the mammograms were labeled and agreed upon by expert radiologists for biomarkers. We extracted the feature patterns from the enrolled mammography images by using the 2D spatial convolutional process (integral image- and fractional-order-based convolutional processes), two-round 1D or 2D kernel convolutional processes, and a pooling process. For each mammography image (right or left side), feature patterns were obtained with a specific bounding box; a total of 500 feature patterns were then used to train and validate the multilayer CNN-based classifier. A total of 200 feature patterns (100 Nors, 50 Bs, and 50 Ms) were randomly selected to train the classifier, and another 150 feature patterns were randomly selected to validate the classifier's performance. In the classification layer, our proposed classifier's structure was determined by K paired input-output training patterns (K = 200), with 200 comparative feature signals and 200 desired labeled patterns, including Class-Nor, Class-B, and Class-M, which were used to establish a GRA-based fully connected network with 100 input nodes, 200 GRA nodes, 4 summation nodes, and 3 output nodes (for three classes). As seen in Table 2, in the 2nd and 3rd convolutional layers, the two-round 1D convolutional processes used the discrete Gaussian function (with stride = 1) with a convolutional mask of data length 200 to extract and enhance the feature signals. In the downsampling layer, the dimension of the feature signal was reduced from 1 × 10,000 to 1 × 100 (with stride = 100).
The pooled 1 × 100 feature signal (as seen in Figure 6) was then fed into the inputs of the GRA-based fully connected network to perform the classification task. The GRA-based classifier was implemented as a multilayer network with straightforward mathematical operations by using Equations (12) to (16), and it processed the incoming feature signals to perform the classification task. Its learning stage did not require complex iterative computations, such as the forward-pass and back-propagation algorithm, gradient descent-based algorithm, or swarm optimization algorithm [6,54–56], to adjust the connecting weights between the input layer and output layer.
We also established a 2D fractional-order CNN-based classifier, consisting of a 2D spatial fractional-order convolutional layer, two-round kernel convolutional-maximum pooling layers, a flattening layer, and a fully connected classification network, as seen in the structure in Table 2. In the classification layer, the fully connected network was a back-propagation multilayer network, consisting of an input layer (with 625 input nodes), two hidden layers (with 168 nodes and 64 nodes in the 1st and 2nd hidden layers, respectively), and an output layer (with 3 output nodes). In the training stage, the gradient descent-based ADAM algorithm was used to adjust the connecting weights between the input layer and the output layer with the forward-pass and back-propagation processes, which iteratively minimized a loss function, such as the binary cross-entropy function for multi-label classification [49,50]. In this study, we implemented the 2D fractional-order CNN-based classifier by using the open-source TensorFlow platform (Version 1.9.0) in the Python programming language [57,58]. For the training dataset with 500 feature patterns (290 tumor-free patterns and 210 tumor patterns), Figure 8a,b show the training history curves for accuracy and loss value over 1000 training epochs in the training stage, with a solid blue line for the training curve and a solid orange line for the validation curve. It can be seen that the training history curve reached saturation after about 400 training epochs; a classification accuracy of 97% was obtained as the training gradually reached the convergence condition. Finally, the training converged, and the value of the loss function was 0.091, as seen in the training convergence curve in Figure 8b.
In the recalling stage, for the 500 feature patterns, the testing results of the classifier are shown as a confusion matrix in Figure 9, in which the abnormal patterns yielded TP (true positive) = 203 and FP (false positive) = 7 and the normal patterns yielded TN (true negative) = 282 and FN (false negative) = 8. These values were used to compute the four evaluation indices of the classifier: precision (%) = 96.70%, recall (%) = 96.20%, accuracy (%) = 97.00%, and F1 score = 0.9640. Hence, the above criteria were evaluated to quantify the classifiers' performance. For small-scale databases, the cross-validation method is used in ML and DL to assess the model's classification performance when there are not enough data to split into training, validation, and testing sets. Through 10-fold ($K_f$ = 10) cross-validation tests, for each fold test, we randomly selected 200 feature patterns from the datasets for training both classifiers, and another 200 feature patterns for validating the classifiers' performance. The experimental results of the 2D fractional-order CNN-based classifier are shown in Table 3, with an average precision of 95.90% (as the positive predictive value, PPV) and an average recall of 96.10% for identifying the feature patterns of tumor cases (B and M) and accurately identifying the abnormality (TP), respectively; an average accuracy of 96.00% for correctly identifying the tumor-free feature patterns and tumor feature patterns; and an average F1 score of 0.9599 for evaluating the classifier's performance in accurately separating normality from abnormality. For each fold test, the classifier's computations took an average of 330.0 s of CPU time to complete the tasks, including the training and testing stages.
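The four indices follow directly from the confusion-matrix counts. The sketch below uses the standard definitions of precision, recall, accuracy, and F1 score; the function name `screening_metrics` is hypothetical, and the counts are the ones quoted from Figure 9.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary screening indices from confusion-matrix counts."""
    precision = tp / (tp + fp)            # positive predictive value
    recall = tp / (tp + fn)               # sensitivity
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


metrics = screening_metrics(tp=203, fp=7, tn=282, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Evaluated on these counts, precision is 203/210, recall is 203/211, and accuracy is 485/500, which agree with the reported 96.70%, 96.20%, and 97.00% to one decimal place; the F1 score comes to about 0.964.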
For the same cross-validation tests with 1D feature signals, as seen in Figure 10 (100 Nors, 50 Bs, and 50 Ms), the experimental results of the 2D spatial and 1D CNN-based classifier are shown in Table 4, with an average precision of 96.70% and an average recall of 96.13% for accurately identifying the tumor cases (TPs); an average accuracy of 96.40% for correctly identifying normality and abnormality; and an average F1 score of 0.9641. The F1 score (the higher the better) was greater than 0.9000, which indicated that the classifier had good predictive capability. In addition, the precision (%), as the index of PPV, was also greater than 80.00%, which indicated that the classifier had good predictive performance for identifying the abnormality (TP). Its pattern recognition scheme took an average of 0.6025 s of CPU time to identify the possible breast lesions. Hence, we recommend the use of the 2D spatial and 1D CNN-based classifier to automatically screen for the presence of breast lesions on mammographic images in clinical applications.
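The fold tests described above draw 200 training patterns and another 200 validation patterns at random for each of the 10 tests, which resembles repeated random subsampling more than a strict partition into 10 disjoint folds. A minimal Python sketch of such a draw is given below. Several details are assumptions made for illustration only: the 105/105 split of the 210 tumor patterns into B and M (the text does not give it), the use of the same 100/50/50 class mix for the validation set, and the helper name `draw_fold`.

```python
import random


def draw_fold(labels, per_class, rng):
    """Draw disjoint training and validation index lists with a fixed class mix."""
    train, valid = [], []
    for cls, n in per_class.items():
        candidates = [i for i, lab in enumerate(labels) if lab == cls]
        if len(candidates) < 2 * n:
            raise ValueError(f"not enough patterns of class {cls!r}")
        picked = rng.sample(candidates, 2 * n)  # sampling without replacement
        train.extend(picked[:n])
        valid.extend(picked[n:])
    return train, valid


# Hypothetical label pool: 290 tumor-free patterns are stated in the text;
# the 105/105 split between B and M is assumed here for illustration.
labels = ["Nor"] * 290 + ["B"] * 105 + ["M"] * 105
per_class = {"Nor": 100, "B": 50, "M": 50}  # class mix shown for Figure 10

rng = random.Random(0)  # fixed seed so the draws are reproducible
folds = [draw_fold(labels, per_class, rng) for _ in range(10)]
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```

Each fold then supplies one training set and one disjoint validation set, and the per-fold indices can be averaged to obtain summary values such as those reported in Tables 3 and 4.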

Discussion
We developed the 2D spatial and 1D CNN-based classifier for mammographic image classification to screen for normality (Nor) or abnormality (B and M classes). For the MIAS image database [22,23], with the 10-fold cross-validation tests, as seen in Table 4, the experimental results indicated an average precision of 96.70%, average recall of 96.13%, average accuracy of 96.40%, and average F1 score of 0.9641 for identifying the breast lesions. The proposed classifier was superior to the traditional 2D CNN-based classifier in design cycle, screening accuracy, parameter assignment (including convolutional masks and the BPNN's network parameters), parameter adjustment, computational complexity level (iteration computations), and computational time. The BPNN's optimal parameters had to be determined by the ADAM algorithm in the training stage, and were updated by adjusting the network parameters, decay parameter, learning rate, and attenuation rate to minimize the error rate. Additionally, ML- and DL-based classification methods have both been used to build different classifier models for clinical/medical purposes, including breast density estimation, mass detection/mass segmentation, mammogram classification/breast lesions screening, and automated breast cancer detection [24–29,37,38,59–61], as seen in Table 5. ML-based methods rely on low-level image features, such as shape, texture, and local key-point features [24,25,27,59,61], and ML-based models, such as SVM, ANN, and clustering methods [24,25,59], have been used to establish various computer-aided vision classifiers. With the MIAS database, the SVM and ANN methods had accuracy rates of 94% and 97.08% for mammogram classification and mass detection, respectively [24,25].
Clustering methods, such as K-means, fuzzy C-means, and GA-based feature selection algorithms [27,61], had accuracy rates of 91.18%, 94.12%, and 84.5% for mass segmentation and mammogram classification, respectively. However, the SVM and ANN required manually labeled classes and selected feature patterns to train the classifier, which also required ongoing human participation and expert intervention to feed new training datasets and continuously update the model for its intended tasks. Hence, such a model needed more datasets to feed the classifier, and its designers had to confirm that its classifications or responses were correct. In clinical application, the model had to be retrained over time with new datasets, which made it inefficient to maintain the classifier's performance, and it was not easy to adjust the classifier in real-time applications. Clustering methods (CM) [27,61] are unsupervised learning methods that help the classifier deal with large, highly flexible, and unpredictable/unlabeled datasets. However, the CM had no definitive standard for evaluating its results or for determining whether the classifier's findings were accurate or useful. In contrast to the ML-based methods, the DL-based methods, such as TTCNN, Grad-CAM CNN, DNN (deep neural network), FCN (fully convolutional network), attention dense-Unet, and dense-Unet models [26–29,37,38,60], had more complex schemes to set up the classifiers with minimal expert intervention, and used large volumes of unstructured data to train a classifier for classification or detection purposes. For example, TTCNN [37] comprised two convolutional layers (with 5 × 5 kernel mask size, 16 and 32 kernel masks), each followed by an MP layer (with 2 × 2 mask size, 16 and 32 MP masks), a 3rd convolutional layer (with 3 × 3 kernel mask size, 64 kernel masks), and a fully connected layer (classification layer).
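To give a sense of the computational load of such a multiconvolutional-pooling scheme, the sketch below traces the feature-map sizes and convolutional parameter counts for the layer sequence described above. It is an illustration only and not the implementation in [37]: the 64 × 64 single-channel input, unit convolution stride, absence of padding, and pooling stride equal to the mask size are all assumptions, and the function names are hypothetical.

```python
def output_size(size, window, stride):
    """Spatial size after an unpadded sliding-window operation."""
    return (size - window) // stride + 1


def trace_shapes(input_size, input_channels, layers):
    """Return (kind, spatial size, channels, parameters) after each layer."""
    size, channels = input_size, input_channels
    rows = []
    for kind, window, filters in layers:
        if kind == "conv":
            # One window x window mask per input channel for each filter, plus a bias.
            params = window * window * channels * filters + filters
            size = output_size(size, window, stride=1)
            channels = filters
        elif kind == "pool":
            params = 0  # maximum pooling has no trainable parameters
            size = output_size(size, window, stride=window)
        else:
            raise ValueError(f"unknown layer kind: {kind!r}")
        rows.append((kind, size, channels, params))
    return rows


# Layer sequence as described in the text: 5x5 conv (16), 2x2 MP,
# 5x5 conv (32), 2x2 MP, 3x3 conv (64).
TTCNN_LIKE = [
    ("conv", 5, 16), ("pool", 2, None),
    ("conv", 5, 32), ("pool", 2, None),
    ("conv", 3, 64),
]

rows = trace_shapes(input_size=64, input_channels=1, layers=TTCNN_LIKE)  # assumed input
for row in rows:
    print(row)
flattened = rows[-1][1] ** 2 * rows[-1][2]
print("flattened feature length:", flattened)
```

Under these assumptions, the final feature map is 11 × 11 × 64, so 7744 values would be passed to the fully connected layer, and the three convolutional layers alone hold 31,744 trainable parameters.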
With the DDSM [19,62], INbreast [63], and MIAS [22,23] databases, the TTCNN had accuracy rates of 99.08%, 96.82%, and 96.57% for breast cancer diagnosis and classification, respectively. The Grad-CAM-based CNN, including DenseNet-169 and EfficientNet-B5 [38], could detect malignant lesions in both craniocaudal and mediolateral oblique view images, highlighting the ROI with red color-coded areas to indicate the positive regions of identified suspicious lesions. This visualization could locate and identify abnormalities in mammograms, such as masses or calcifications. DenseNet-169 and EfficientNet-B5 had mean accuracy rates of 88.1% and 87.9% for automated breast cancer detection, respectively. Thus, these multiconvolutional-pooling layers were used to select 2D features for improved image contrast (contrast adjustment), which limited the size of the output patterns and refined the classifier's recognition ability. However, maintaining the model's purpose and performance required the training dataset to be continuously maintained, and excessive multiconvolutional-pooling processes would degrade the position, orientation, and spatial relationship information of the desired object.
ML-based methods could be established rapidly but might yield limited results in their applications, whereas DL-based methods required more time to set up the model but could rapidly produce results and had promising classification accuracy with the multiconvolutional processes. In addition, DL-based models required GPU hardware resources to perform the multiconvolutional-pooling processes and the classifier's training tasks.
Therefore, we integrated the 2D spatial and 1D CNN-based classifier to simplify the 2D multiconvolutional processes and reduce the computational complexity level. In the classification layer, the GRA-based classifier used straightforward mathematical operations, without optimization/fine-tuning algorithms or iterative computations, to perform the training and pattern recognition tasks. Some advantages of the proposed classifier are listed below:

• The possible breast lesions' spatial and edge information could be enhanced by the II-based spatial convolutional process in the first convolutional layer, which helped to easily locate the ROI and extract feature patterns from the original mammographic image;
• The suitable two-round 1D convolutional processes could quantify the differentiation levels between normality and abnormality signals, which helped to preliminarily separate the Nor from the B and M classes;
• The dimension of the feature signals could be reduced by the 1D pooling process, which helped to overcome the classifier's overfitting problems in the training stage;
• Straightforward mathematical operations performed the training and pattern recognition tasks;
• Updating the optimal parameters in the training stage did not require convergence condition assignment or parameter adjustment;
• Determining the network parameters did not require complex iteration computations or optimization algorithms;
• The classification accuracy could be obtained in less computation time, making it feasible to replace manual screening that requires specific expertise and experience.
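To illustrate why a GRA-based classification layer needs neither iterative training nor an optimization algorithm, the following minimal Python sketch computes gray relational grades between one query signal and a set of stored reference signals, and then assigns the class of the best-matching reference. It uses the classical gray relational coefficient with a distinguishing coefficient of 0.5; the exact formulation of the proposed classifier is not given in this section, so the decision rule, the toy reference signals, and the function names are assumptions made for illustration.

```python
def gray_relational_grades(query, references, zeta=0.5):
    """Gray relational grade of each reference sequence with respect to the query.

    Uses the classical coefficient
        (d_min + zeta * d_max) / (d_i(k) + zeta * d_max),
    averaged over the sequence positions k.
    """
    deltas = [[abs(q - r) for q, r in zip(query, ref)] for ref in references]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        # Every reference is identical to the query.
        return [1.0] * len(references)
    return [
        sum((d_min + zeta * d_max) / (d + zeta * d_max) for d in row) / len(row)
        for row in deltas
    ]


def classify(query, references, labels, zeta=0.5):
    """Assign the label of the reference with the highest gray relational grade."""
    grades = gray_relational_grades(query, references, zeta)
    best = max(range(len(grades)), key=grades.__getitem__)
    return labels[best], grades


# Hypothetical 1D feature signals, one stored reference per class.
references = [
    [0.1, 0.1, 0.2, 0.1],  # Nor
    [0.4, 0.6, 0.5, 0.4],  # B
    [0.8, 0.9, 1.0, 0.9],  # M
]
labels = ["Nor", "B", "M"]

label, grades = classify([0.45, 0.55, 0.5, 0.45], references, labels)
print(label, [round(g, 3) for g in grades])
```

In this sketch, "training" consists only of storing labeled reference signals, so adding new patterns requires no weight updates or convergence checks, which is consistent with the advantages listed above.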

Conclusions
Routine imaging examinations, such as mammography and breast ultrasound imaging, can be used for the early detection of breast lesions, increasing survival rates and helping to save lives. Mammography and breast ultrasound are both first-line modalities for clinical examinations. However, breast ultrasound imaging has a poor screening capacity for detecting small calcifications (among the earliest signs of breast cancer) and must be combined with diagnostic mammography to evaluate suspected breast lesions and changes in breast tissues. Breast ultrasound is an assistive tool for screening breast cancer and offers a visual guide for performing a biopsy. For follow-up screening, where an abnormal screening mammogram is obtained, clinicians or radiologists can use low-dose X-rays to capture more images of the breast tissue and inspect suspicious lesions, such as calcifications or small tumors, as the earliest signs of breast cancer. Hence, based on mammographic image classification, the proposed 2D spatial and 1D CNN-based classifier could be directly used to screen breast lesions in clinical applications. In contrast to the ML- and DL-based methods [24–29,37,38,59–61], the proposed multilayer classifier had some advantages: (1) a spatial convolutional process for enhancing the breast lesions' features; (2) two-round 1D convolutional processes for identifying the differences between the normal and B/M classes; and (3) straightforward mathematical operations for performing the training and screening tasks. Through 10-fold cross-validation tests, we obtained promising results for screening breast lesions, with a high mean F1 score (0.9641), precision (96.70%), and recall (96.13%) for separating the Nor from the B and M classes.
However, background tissue type, such as higher breast density (especially in Asian women), might affect the classification accuracy in mammographic images [38]. As lesions might be obscured by dense tissue, such as dense breast or intermediate mixed-type breast density, AI-based methods might not identify them accurately at an early stage, thus increasing the risk that breast cancer goes undetected. The proposed screening model has overcome limitations such as parameter assignment, parameter adjustment, iteration computation, and optimization algorithm requirements. Its training scheme can adaptively retrain the classifier with new image datasets, such as clinical images, the DDSM database, or the INbreast database, in less computation time. Hence, new/special mammographic images can be continuously added to the training datasets to rapidly retrain the classifier and maintain its intended medical purpose. Its pattern recognition scheme could be implemented as a computer-aided decision-making tool or a software as a medical device (SaMD) tool [64,65]. Therefore, we suggest that the proposed automatic screening model could replace the traditional CNN methods for specific tasks requiring expertise and experience in medical image examinations, such as diagnostic mammography, CT, and MRI, which will help to reduce clinicians' burden and allow them to focus on follow-up decision making and medical strategies.