1. Introduction
Artificial intelligence may help in the detection and diagnosis of disease. In cancer, early detection is essential to prevent the disease from spreading through the body and causing the patient’s death. Breast cancer is one of the most aggressive types of cancer and is responsible for almost 685,000 deaths among women worldwide [
1]. In Mexico, breast cancer increased between 2013 and 2016, with 24,695 deaths among women [
2]. Thus, an early diagnosis is critical for breast cancer survival [
3]. Screening mammography is the preferred early detection strategy for reducing breast cancer mortality [
4]. Mammography screening has had a positive impact in about
of breast cancer detection [
5]. CAD systems, in turn, try to emulate the process that the radiologist follows to detect cancer. Detecting early signs of breast cancer is a routine and repetitive procedure. Of the subjects that a radiologist typically examines for breast cancer, only
of the cases are malignant [
6].
Aiming to reduce the radiologist’s workload, computer-aided detection (CAD) systems are designed to assist the radiologist as a second opinion, and they may aid in the correct interpretation of suspicious findings [
7,
8,
9,
10]. This is not a trivial task because of the heterogeneity of abnormalities and the darkening under dense masses, which make it difficult to identify a possible breast cancer. Mammography reveals the internal structure of the breast, so that tissues and lesions such as nodules, calcifications, asymmetries in breast density, and distortion of the breast architecture can be studied [
11,
12,
13,
14]. The extracted features provide information about shape, contour, density, and perimeter, and serve as the input of an artificial intelligence system that classifies the lesion as benign or malignant [
15]. The relationship between breast lesion analysis and morphological description has been widely investigated [
16]. Other research has focused on extracting features with the variogram function as a texture descriptor. Other features are extracted using a hybrid scheme of texture, co-occurrence matrix, and geometric features with a neural network [
17].
PyRadiomics is a medical imaging tool for feature extraction. Granzier used the PyRadiomics toolkit for tissue characterization [
18]. Gao et al. performed a similar series of experiments using the PyRadiomics platform to predict the axillary lymph node tumor burden in breast cancer patients [
19]. Vamvakas investigated the utility of boosting ensemble classification methods for increasing the diagnostic performance in differentiating benign and malignant breast lesions [
20]. The first idea is to reduce the number of features, which lowers the computational cost. For example, Galván-Tejada et al. [
21] proposed a multivariate model that classifies lesions as benign or malignant using a genetic algorithm that analyzes the morphological characteristics of the lesions to obtain an optimal classification. Genetic algorithms have proven to be an efficient optimization tool for feature selection in computer-assisted diagnosis, so this approach is also used in this investigation [
22,
23,
24].
Reports [25, 26] demonstrate that ML models can reduce false positives when classifying lesions by applying optimization techniques to images. Moreover, some cross-sectional studies suggest an association between fatty and fatty-glandular tissue in the analysis of mammograms using a set of microcalcification features. Other research, which is based on texture description, spectral clustering, and Support Vector Machines (SVM) for the detection of breast masses [
27], also aims to obtain more informative features. Other multivariate analysis approaches have demonstrated that prognostic information and predictive factors can be obtained to identify breast cancer in its early stages [
28]. Among the digital image processing and pattern recognition techniques that have been applied to breast cancer, mutual information combined with a greedy selection is used for diagnosis when the information is uniformly distributed [
29]. Feature selection for classifying benign and malignant lesions can also be performed with standard classification algorithms such as K-nearest neighbors (KNN), decision trees, and naive Bayes [
30]. Haralick et al. [
31] introduced for the first time the concept of the Gray Level Co-occurrence Matrix (GLCM) for the analysis of texture patterns and their spatial classification. These relationships are encoded in the co-occurrence matrix, which recent works have incorporated for texture classification in breast images [
32]; this concept is also considered in our proposal.
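To make the co-occurrence idea concrete, the following is a minimal sketch in plain Python that counts gray-level pairs at a fixed pixel offset and derives the contrast measure from the normalized matrix. The function names, the horizontal offset, and the 4x4 toy image are illustrative assumptions; this is not the PyRadiomics implementation, which aggregates several directions and applies its own gray-level discretization.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized co-occurrence matrix for the pixel offset (dx, dy).

    `image` is a list of rows holding integer gray levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: sum over (i, j) of (i - j)^2 * p(i, j)."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


# Toy 4-level image (hypothetical values, for illustration only).
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy, levels=4)
contrast = glcm_contrast(p)
```

Smooth regions concentrate the probability mass on the diagonal of the matrix and give a low contrast, whereas frequent transitions between distant gray levels raise it.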
On the other hand, Tsochatzidis et al. investigated the performance of multiple networks for breast cancer diagnosis from mammograms with mass lesions [
33]. The incorporation of a margin-specific content-based image retrieval approach into a computer-aided diagnosis scheme of mammographic masses is investigated by the same authors in [
34]. Andrik proposed a method, which is based on AlexNet with some modifications and has been adapted to our classification problem [
35]. A deep ensemble transfer learning and neural network classifier for automatic feature extraction and classification was proposed by Aurora [
36]. It should be mentioned that the authors also work with the CBIS-DDSM images. Furthermore, a CAD system based on deep Convolutional Neural Networks (CNN) for classifying mammography mass lesions was investigated on three data sets [
37].
Feature analysis plays an important role in developing specialized software for extracting the key features and building a robust classification scheme; numerous experiments have been implemented with the PyRadiomics system [
38]. In this work, a predictive model was implemented to classify lesions with calcification as benign or malignant. The focus of the work is to speed up the diagnosis of breast cancer using a genetic algorithm and the PyRadiomics system. Subsequently, the diagnosis can be confirmed by the radiologist within the clinical workflow.
The remainder of the paper is organized as follows. Section 1 examines other investigations in the literature. Section 2 describes the materials and methods. The experimental design is presented in Section 3. The results and discussion are presented in Section 4 and Section 5. Finally, Section 6 gives the conclusions.
3. Experimental Setup
In this research, independent experiments were used to discriminate between two types of breast lesion, benign or malignant, in images with calcifications and masses. Only left or right breast images with suspicious regions were selected for the proposed experiments; a total of 400 left and right breast mammograms in the CC projection were used. For calcification, the first sub-set (CS1) was obtained by using only the data inside the ROI segmentation provided by the radiologist; the second calcification sub-set (CS2) was obtained from the whole breast segmentation. The same process was applied to both sub-sets of the mass data set (MS1, MS2) (as shown in
Figure 4).
The segmentation process eliminates artifacts and labels from the mammogram image and selects the breast ROI. A threshold value was used to extract the binary mask, and morphological operations were then applied to the segmentation mask to obtain the region of interest of the breast. Feature extraction was performed with the PyRadiomics system, which requires an image and a mask as input; for these experiments, the CS1 and MS1 cases used the mask provided by the radiologist, whereas the CS2 and MS2 cases used the breast segmentation mask.
Once the mammography features were extracted by PyRadiomics, 141 features were selected on the basis of texture information from the lesion and from the breast segmentation, and the 21 shape descriptors were removed. Features from the Gray Level Co-occurrence Matrix, Gray Level Run Length Matrix, Gray Level Size Zone Matrix, Neighbouring Gray Tone Difference Matrix, and Gray level classes were selected for this experiment.
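The extraction step described above could be sketched as follows with the PyRadiomics Python API. The function name, the file paths, the mask label value, and the mapping of the "Gray level" class to the `firstorder` feature class are assumptions made for illustration; the exact settings used in the experiments are not stated in the text. The library is imported inside the function so that the feature-class list can be inspected without PyRadiomics installed.

```python
# Texture feature classes enabled in this sketch; "shape" is deliberately absent.
# Assumption: the "Gray level" class in the text maps to PyRadiomics "firstorder".
TEXTURE_CLASSES = ["firstorder", "glcm", "glrlm", "glszm", "ngtdm"]


def extract_texture_features(image_path, mask_path, label=1):
    """Return the texture features of one image/mask pair as a dict.

    `image_path` and `mask_path` are hypothetical file paths; `label` is the
    value that marks the region of interest inside the mask (assumed to be 1).
    """
    from radiomics import featureextractor  # requires the pyradiomics package

    extractor = featureextractor.RadiomicsFeatureExtractor()
    extractor.disableAllFeatures()
    for name in TEXTURE_CLASSES:
        extractor.enableFeatureClassByName(name)

    result = extractor.execute(image_path, mask_path, label=label)
    # Drop the diagnostic entries and keep only the feature values.
    return {k: v for k, v in result.items() if not k.startswith("diagnostics_")}
```

Under this sketch, the CS1 and MS1 cases would pass the radiologist's ROI mask as `mask_path`, and the CS2 and MS2 cases would pass the whole-breast mask.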
Then, in order to select the best features to construct a robust model, a feature selection process was implemented in two stages: in the first stage, the zero-variance features were removed; in the second stage, a genetic algorithm (GALGO) [
23] was used to search for the best combination of features that correctly classify the samples.
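The two-stage selection can be illustrated with a small, self-contained sketch. It is a generic genetic algorithm written in plain Python, not the GALGO package itself: the nearest-centroid fitness, the chromosome size, the population size, the mutation rate, and the toy data are all assumptions chosen to keep the example short. For brevity the fitness is measured on the training data; a real experiment would estimate it on held-out samples.

```python
import random
from statistics import mean, pvariance


def drop_zero_variance(X):
    """Stage 1: indices of the columns whose variance is non-zero."""
    return [j for j in range(len(X[0])) if pvariance([row[j] for row in X]) > 0]


def centroid_accuracy(X, y, genes):
    """Fitness: accuracy of a nearest-centroid rule restricted to `genes`."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(r[g] for r in rows) for g in genes]
    hits = 0
    for x, t in zip(X, y):
        pred = min(
            centroids,
            key=lambda c: sum((x[g] - m) ** 2 for g, m in zip(genes, centroids[c])),
        )
        hits += pred == t
    return hits / len(y)


def select_features(X, y, size=2, pop=20, generations=30, seed=0):
    """Stage 2: evolve fixed-size chromosomes (lists of feature indices)."""
    rng = random.Random(seed)
    candidates = drop_zero_variance(X)
    population = [rng.sample(candidates, size) for _ in range(pop)]

    def fitness(chromosome):
        return centroid_accuracy(X, y, chromosome)

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop // 2]          # keep the fittest half
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)
            pool = list(dict.fromkeys(a + b))     # crossover: merge parent genes
            child = rng.sample(pool, size)
            if rng.random() < 0.2:                # mutation: swap in a new gene
                gene = rng.choice(candidates)
                if gene not in child:
                    child[rng.randrange(size)] = gene
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return sorted(best), fitness(best)


# Toy data: column 0 is constant, column 1 is uninformative,
# columns 2 and 3 separate the two classes (0 = benign, 1 = malignant).
X = [
    [1.0, 0.0, 0.10, 5.0], [1.0, 1.0, 0.20, 5.5],
    [1.0, 0.0, 0.15, 5.2], [1.0, 1.0, 0.05, 5.1],
    [1.0, 0.0, 0.90, 9.0], [1.0, 1.0, 0.80, 9.5],
    [1.0, 0.0, 0.85, 9.2], [1.0, 1.0, 0.95, 9.1],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]
best, score = select_features(X, y)
```

The constant column never enters a chromosome because it is filtered out before the evolution starts, which mirrors the order of the two stages in the text.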
Then, each of the CS1, CS2, MS1 and MS2 sub-sets was validated by means of k-fold cross-validation. The data set was first shuffled to make up k different sub-sets for the training and test phases. A series of metrics was then computed to assess the performance of the models on unseen data: the AUC, sensitivity, specificity, and accuracy.
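As an illustration of this validation step, the sketch below shuffles the sample indices into k folds and computes the four reported metrics in plain Python. The value of k, the 0.5 decision threshold, and the toy labels and scores are assumptions, since the text does not state them; labels are coded as 1 for malignant and 0 for benign.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and yield (train, test) index lists for each of k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = malignant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy predictions on one held-out fold (hypothetical values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
splits = list(k_fold_indices(10, 5))
```

In a full run, the model would be refitted on each `train` list and scored on the matching test list, and the per-fold metrics would then be averaged.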
4. Results
In this section, results are reported for four sub-sets of images with masses and calcifications. The process reads DICOM images and converts them from gray level into binary images. The experiment consisted of 400 images; malignant and benign lesions in right or left breast images are considered in all cases. The ROI segmentation is provided by the radiologist, and the breast segmentation is obtained according to the proposed methodology.
The breast segmentation process was based on contour detection. First, the algorithm finds all the objects inside the input image; then the area of each object is computed; next, the object with the biggest area is selected as the candidate for the breast. Once the breast is selected, all other objects are eliminated. Nevertheless, several of the input images have noise or unwanted tissue on the frame boundary; to eliminate such artifacts, 5% of the edge of the image is removed, creating a segmentation mask that contains only breast tissue.
Figure 9 shows an example of this process.
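The steps above can be sketched in plain Python as follows. The sketch swaps contour detection for 4-connected component labelling, which yields the same "largest object" selection on a binary image; the function name, the threshold value, and the 20x20 toy image are assumptions for illustration and do not reproduce the authors' implementation.

```python
from collections import deque


def breast_mask(image, threshold, border_pct=5):
    """Keep the largest bright object and clear `border_pct` percent of each edge."""
    rows, cols = len(image), len(image[0])
    binary = [[px > threshold for px in row] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    largest = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not seen[r][c]:
                # Breadth-first search collects one connected object.
                component, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    component.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(component) > len(largest):
                    largest = component
    # Margins removed from each side of the frame.
    mr, mc = rows * border_pct // 100, cols * border_pct // 100
    mask = [[0] * cols for _ in range(rows)]
    for y, x in largest:
        if mr <= y < rows - mr and mc <= x < cols - mc:
            mask[y][x] = 1
    return mask


# Toy 20x20 "mammogram": a large breast region touching the left edge
# and a small bright label in the upper right corner.
image = [[0] * 20 for _ in range(20)]
for r in range(2, 18):
    for c in range(0, 10):
        image[r][c] = 200
for r in range(1, 3):
    for c in range(15, 18):
        image[r][c] = 255
mask = breast_mask(image, threshold=50)
```

In the toy image the label is discarded because it is the smaller object, and the column of breast pixels that touches the frame is cleared by the border removal.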
To start the feature extraction, the four groups CS1, CS2, MS1 and MS2 and their corresponding binary masks were used as the input for the PyRadiomics system. PyRadiomics extracted 110 features; the features related to shape and those with zero variance were removed, giving a total of 88 texture features. The GA (GALGO) algorithm analyzes different models obtained through evolution, with a maximum of 300 generations. The models obtained from the evolution process are shown in
Figure 10,
Figure 11,
Figure 12 and
Figure 13. The horizontal axis shows the genes ordered by rank, and the vertical axis shows the gene frequency and the colour-coded rank of each gene in previous evolutions. Changes in rank are marked by different colours. These figures summarize the population of chromosomes within each generation, where black represents the most stable chromosome across all generated models.
In
Figure 11 and
Figure 13, seven black stable chromosomes were generated for ROI segmentation. However, for segmentation by the radiologist, as shown in
Figure 10 and
Figure 12, seven black stable chromosomes were obtained. Finally,
Table 3 and
Table 4 show a comparison of chromosomes generated in each model.
The global AUC criterion was also calculated by taking the average of all implemented models.
Table 5 shows a comparison between the results of the experiments with the CS1 and the CS2 data set.
The same comparison process as above is performed but now using the mass data set, as shown in
Table 6. Features shown in black are the most important for predicting cancer.
In
Table 7, the best predictors for the classification between benign or malignant using logistic regression for each CS1, CS2, MS1 and MS2 models are shown.
Moreover, to validate the results obtained with the proposed methodology, the accuracy and AUC results are compared with other proposals; the results are shown in
Table 8.
5. Discussion
The results obtained when using the sub-sets CS1, CS2, MS1 and MS2 to classify calcifications and masses were as good as could be expected; for example, the obtained AUC was at least 0.8 for calcification and at least 0.9 for masses. The full set of predictive measures obtained for the calcification and mass data sets across the regions of interest is shown in
Table 5 and
Table 6. As shown, the predictive accuracy is 86% for CS1 and 76% for CS2. The minimal difference between the two models in predicting malignant or benign images is 12%. On the other hand, for the results from
Table 6, the predictive accuracy is 95% for the MS1 model and 74% for the MS2 model. In the comparison between these two models, the difference in accuracy was 22%. Finally, the evidence suggests that the prediction model CS2 (Calcification) has a higher probability of predicting the MS2 (Malignant) with a percentage of 10% of error.
According to
Table 7, the results demonstrate that, for classification purposes, GLCM Difference Entropy, GLCM Contrast, and GLCM Difference Variance are strongly correlated in the CS1, MS1, and MS2 models. The CS1 and MS2 models are related through the NGTDM Busyness feature. Finally, the CS1 and MS1 models are correlated through the GLCM Id and First Order Total Energy features. GLCM Difference Entropy is another correlated measure shared by the MS1 and MS2 cases. This experiment demonstrates that the GLCM class provides strong predictors for classifying lesions as malignant or benign. The most important result that emerges from the analysis in this section is the relationship between breast mass and cancer and, respectively, between breast calcification and cancer; three radiomic features, from the GLRLM, GLSZM, and GLCM classes, are considered stable. Another advantage of the selection procedure used in the proposed methodology is the 20% dimensionality reduction achieved when generating a new optimal model.
The results provided in
Table 8 compare the proposed methodology with other state-of-the-art methodologies. To assess the validity of the proposed methodology, it is compared with four other methods that employ the same database used in this research (CBIS-DDSM). It is important to note that these methods evaluate benign and malignant lesions in calcification and mass mammogram images using two projections, MLO and CC. The results obtained with the proposed methodology outperform those reported by [
34,
36] for benign and malignant lesions; for example, in the MS1 case our proposal gives an area under the curve (AUC) of about 0.95 and an accuracy of 0.96. Among the CS1, CS2, MS1 and MS2 models, the lowest AIC score, 166, is obtained for the MS2 model (lower is better).
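For reference, and assuming the standard definition of the Akaike information criterion is the one used here (the text does not state the formula), the score of a fitted model with $k$ estimated parameters and maximized likelihood $\hat{L}$ is:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

The first term penalizes the number of parameters and the second rewards goodness of fit, which is why the model with the lowest score is preferred when several candidates are compared.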
Feature extraction provides information for classifying breast lesions, and it is possible to make a good feature selection using logistic regression classification based on the texture image. This study found that the mass provides more information for classification, but the calcifications do not necessarily give more information. The calcifications could be segmented and, subsequently, features were extracted. The relationship between mass, calcification and cancer has the best classification rates when it is evaluated by the Gray Level Co-occurrence Matrix.
On the other hand, the image analysis performed by [
35] also evaluates two types of lesions, calcifications and masses, with two projections, obtaining an AUC of 0.84 and an accuracy of 0.8. In this comparison, for the CS1 case, the proposed method gives a better result, with an AUC of 0.86 and an accuracy of 0.82. However, among the CS1, CS2, MS1 and MS2 models, the lowest AIC score, 166, is obtained for the MS2 model (lower is better).
It has been demonstrated that the analysis of the CC projection provides the best information for classifying benign and malignant lesions, allowing an optimal feature extraction from the mammary tissue.
6. Conclusions
Detecting breast cancer at an early stage can prevent it from spreading to other parts of the body and can avoid the patient’s death. The integration of predictive models into the diagnosis of breast cancer has allowed radiologists to make quick decisions. Comparing the breast lesion analysis performed by a radiologist with the breast segmentation on mammography used by the classification models implemented in this work, there is no substantial difference in decision making. Genetic algorithms were used to help choose the best predictors for the detection of breast cancer; the implemented models reach an AUC of 86% for calcification and 95% for masses.
Although much research focuses mainly on finding the region of interest, that type of analysis only allows lesions to be found in a very restricted area. In this new methodology, we propose an automated segmentation based on the analysis of the whole breast region to classify between benign and malignant lesions. The results demonstrate that the difference between the lesion and the whole breast is around 10% for calcifications and around 20% for masses. Based on these results, the radiologist would focus on the cases that the system flags as malignant and carry out a more in-depth study of them. Our proposal thus speeds up the radiologist’s decision-making.
The purpose of the present investigation is not to change the opinion of the radiologist, but to motivate the use of an alternative tool that improves the response time of the analysis when detecting malignant or benign lesions in images with calcifications or masses. The PyRadiomics system provided optimal features for a good classification. However, this system is limited by both the processing speed and the amount of memory available.