# Breast Tumor Classification Using an Ensemble Machine Learning Method

## Abstract

## 1. Introduction

## 2. Related Work

## 3. Methodology

#### 3.1. Classification Methods

#### 3.1.1. Simple Logistic Regression Model

#### 3.1.2. SVM Learning with Stochastic Gradient Descent (SGD) Optimization

#### 3.1.3. Multilayer Perceptron Network

#### 3.1.4. Random Decision Tree

#### 3.1.5. Random Decision Forest

#### 3.1.6. SVM Learning with Sequential Minimal Optimization (SMO)

#### 3.1.7. K-Nearest Neighbor Classification

#### 3.1.8. Naïve Bayes Classification

#### 3.2. Voting Mechanism

#### 3.2.1. Majority-Based Voting Mechanism (Hard Voting)

#### 3.2.2. Soft Voting

## 4. Results

#### Performance Evaluation Measures

## 5. Discussion

## 6. Comparison with Existing Work

## 7. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Chagpar, A.B.; Coccia, M. Factors associated with breast cancer mortality-per-incident case in low-to-middle income countries (LMICs). J. Clin. Oncol. **2019**, 37, 15.
2. Sharma, G.N.; Dave, R.; Sanadya, J.; Sharma, P.; Sharma, K. Various types and management of breast cancer: An overview. J. Adv. Pharm. Technol. Res. **2010**, 1, 109.
3. Turkki, R.; Byckhov, D.; Lundin, M.; Isola, J.; Nordling, S.; Kovanen, P.E.; Verrill, C.; von Smitten, K.; Joensuu, H.; Lundin, J.; et al. Breast cancer outcome prediction with tumour tissue images and machine learning. Breast Cancer Res. Treat. **2019**, 177, 41–52.
4. Guo, Y.; Shang, X.; Li, Z. Identification of cancer subtypes by integrating multiple types of transcriptomics data with deep learning in breast cancer. Neurocomputing **2019**, 324, 20–30.
5. Golden, J.A. Deep learning algorithms for detection of lymph node metastases from breast cancer: Helping artificial intelligence be seen. JAMA **2017**, 318, 2184–2186.
6. Li, L.; Pan, X.; Zhang, L. Multi-task deep learning for fine-grained classification and grading in breast cancer histopathological images. Multimed. Tools Appl. **2018**, 810, 85–95.
7. Zhu, Z.; Albadawy, E.; Saha, A.; Zhang, J.; Harowicz, M.R.; Mazurowski, M.A. Deep learning for identifying radiogenomic associations in breast cancer. Comput. Biol. Med. **2019**, 109, 85–90.
8. Bejnordi, B.E.; Veta, M.; Van Diest, P.J.; Van Ginneken, B.; Karssemeijer, N.; Litjens, G.; Van Der Laak, J.A.; Hermsen, M.; Manson, Q.F.; Balkenhol, M.; et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA **2017**, 318, 2199–2210.
9. Bi, W.L.; Hosny, A.; Schabath, M.B.; Giger, M.L.; Birkbak, N.J.; Mehrtash, A.; Allison, T.; Arnaout, O.; Abbosh, C.; Dunn, I.F.; et al. Artificial intelligence in cancer imaging: Clinical challenges and applications. CA Cancer J. Clin. **2019**, 69, 127–157.
10. Lamy, J.B.; Sekar, B.; Guezennec, G.; Bouaud, J.; Séroussi, B. Explainable artificial intelligence for breast cancer: A visual case-based reasoning approach. Artif. Intell. Med. **2019**, 94, 42–53.
11. Shen, L.; Margolies, L.R.; Rothstein, J.H.; Fluder, E.; McBride, R.; Sieh, W. Deep learning to improve breast cancer detection on screening mammography. Sci. Rep. **2019**, 9, 1–12.
12. Coccia, M. Deep learning technology for improving cancer care in society: New directions in cancer imaging driven by artificial intelligence. Technol. Soc. **2020**, 60, 101198.
13. Khan, S.; Islam, N.; Jan, Z.; Din, I.U.; Rodrigues, J.J.C. A novel deep learning based framework for the detection and classification of breast cancer using transfer learning. Pattern Recognit. Lett. **2019**, 125, 1–6.
14. Wang, H.; Yoon, S.W. Breast cancer prediction using data mining method. In Proceedings of the IIE Annual Conference Expo 2015, Nashville, TN, USA, 30 May–2 June 2015; pp. 818–828.
15. Nguyen, Q.H.; Do, T.T.; Wang, Y.; Heng, S.S.; Chen, K.; Ang, W.H.M.; Philip, C.E.; Singh, M.; Pham, H.N.; Nguyen, B.P.; et al. Breast Cancer Prediction using Feature Selection and Ensemble Voting. In Proceedings of the 2019 International Conference on System Science and Engineering (ICSSE), Dong Hoi City, Vietnam, 20–21 July 2019; pp. 250–254.
16. Ahmad, L.G.; Eshlaghy, A.; Poorebrahimi, A.; Ebrahimi, M.; Razavi, A. Using three machine learning techniques for predicting breast cancer recurrence. J. Health Med. Inf. **2013**, 4, 3.
17. Nazir, S.; Ghazanfar, M.A.; Aljohani, N.R.; Azam, M.A.; Alowibdi, J.S. Data analysis to uncover intruder attacks using data mining techniques. In Proceedings of the 2017 5th International Conference on Information and Communication Technology (ICoIC7), Melaka, Malaysia, 17–19 May 2017; pp. 1–6.
18. Mandal, S.K. Performance analysis of data mining algorithms for breast cancer cell detection using Naïve Bayes, logistic regression and decision tree. Int. J. Eng. Comput. Sci. **2017**, 6, 20388–20391.
19. Borges, L.R. Analysis of the Wisconsin Breast Cancer Dataset and Machine Learning for Breast Cancer Detection. Group **1989**, 1, 369.
20. Chaurasia, V.; Pal, S.; Tiwari, B. Prediction of benign and malignant breast cancer using data mining techniques. J. Algorithms Comput. Technol. **2018**, 12, 119–126.
21. Kumar, V.; Mishra, B.K.; Mazzara, M.; Verma, A. Prediction of Malignant & Benign Breast Cancer: A Data Mining Approach in Healthcare Applications. arXiv **2019**, arXiv:1902.03825.
22. Lee, S.; Amgad, M.; Masoud, M.; Subramanian, R.; Gutman, D.; Cooper, L. An Ensemble-based Active Learning for Breast Cancer Classification. In Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), San Diego, CA, USA, 18–21 November 2019; pp. 2549–2553.
23. Abdar, M.; Makarenkov, V. CWV-BANN-SVM ensemble learning classifier for an accurate diagnosis of breast cancer. Measurement **2019**, 146, 557–570.
24. Alam, K.M.R.; Siddique, N.; Adeli, H. A dynamic ensemble learning algorithm for neural networks. Neural Comput. Appl. **2019**, 1–16.
25. Osman, A.H.; Aljahdali, H.M. An Effective of Ensemble Boosting Learning Method for Breast Cancer Virtual Screening using Neural Network Model. IEEE Access **2020**, 8, 39165–39174.
26. Landwehr, N.; Hall, M.; Frank, E. Logistic model trees. Mach. Learn. **2005**, 59, 161–205.
27. Sumner, M.; Frank, E.; Hall, M. Speeding up logistic model tree induction. In Proceedings of the European Conference on Principles of Data Mining and Knowledge Discovery, Porto, Portugal, 3–7 October 2005; pp. 675–683.
28. Bottou, L. Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT’2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 177–186.
29. Pal, S.K.; Mitra, S. Multilayer perceptron, fuzzy sets, and classification. IEEE Trans. Neural Netw. **1992**, 3, 683–697.
30. Rokach, L. Decision forest: Twenty years of research. Inf. Fusion **2016**, 27, 111–125.
31. Daho, M.E.H.; Chikh, M.A. Combining bootstrapping samples, random subspaces and random forests to build classifiers. J. Med. Imaging Health Inform. **2015**, 5, 539–544.
32. Khedr, A.E.; Idrees, A.M.; El Seddawy, A.I. Enhancing Iterative Dichotomiser 3 algorithm for classification decision tree. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. **2016**, 6, 70–79.
33. She, J.; Schmidt, M. Linear convergence and support vector identification of sequential minimal optimization. In Proceedings of the 10th NIPS Workshop on Optimization for Machine Learning, Long Beach, CA, USA, 8 December 2017; p. 5.
34. Nazir, S.; Yousaf, M.H.; Velastin, S.A. Inter and intra class correlation analysis (IICCA) for human action recognition in realistic scenarios. In Proceedings of the 8th International Conference on Pattern Recognition Systems (ICPRS), Madrid, Spain, 11–13 July 2017.
35. Ibarra, J.B.; Caya, M.V.C.; Bentir, S.A.P.; Paglinawan, A.C.; Monta, J.J.; Penetrante, F.; Mocon, J.; Turingan, J. Development of the Low Cost Classroom Response System Using Test-Driven Development Approach and Analysis of the Adaptive Capability of Students Using Sequential Minimal Optimization Algorithm. In Proceedings of the 2019 IEEE 6th International Conference on Industrial Engineering and Applications (ICIEA), Tokyo, Japan, 12–15 April 2019; pp. 689–693.
36. Nazir, S.; Yousaf, M.H.; Velastin, S.A. Feature Similarity and Frequency-Based Weighted Visual Words Codebook Learning Scheme for Human Action Recognition. In Proceedings of the Pacific-Rim Symposium on Image and Video Technology, Wuhan, China, 20–27 November 2017; pp. 326–336.
37. Nazir, S.; Yousaf, M.H.; Velastin, S.A. Evaluating a bag-of-visual features approach using spatio-temporal features for action recognition. Comput. Electr. Eng. **2018**, 72, 660–669.
38. Al-Sabbah, S.A.; Mohammad, S.F.; Eanad, M.M. Use of the Naive Bayes Function and the Models of Artificial Neural Networks to Classify Some Cancer Tumors. Indian J. Public Health Res. Dev. **2019**, 10, 1563–1569.
39. Delgado, J.; Ishii, N. Memory-based weighted majority prediction. In Proceedings of the SIGIR Workshop Recommender Systems, Berkeley, CA, USA, 19 August 1999.
40. Kang, X.B.; Lin, G.F.; Chen, Y.J.; Zhao, F.; Zhang, E.H.; Jing, C.N. Robust and secure zero-watermarking algorithm for color images based on majority voting pattern and hyper-chaotic encryption. Multimed. Tools Appl. **2019**, 79, 1169–1202.
41. Du, K.L.; Swamy, M. Combining Multiple Learners: Data Fusion and Ensemble Learning. In Neural Networks and Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2019; pp. 737–767.
42. UCI. Breast Cancer Wisconsin Dataset. Available online: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic) (accessed on 9 June 2019).
43. Nahato, K.B.; Harichandran, K.N.; Arputharaj, K. Knowledge mining from clinical datasets using rough sets and backpropagation neural network. Comput. Math. Methods Med. **2015**, 2015, 460189.
44. Chen, H.L.; Yang, B.; Liu, J.; Liu, D.Y. A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis. Expert Syst. Appl. **2011**, 38, 9014–9022.
45. Kumari, M.; Singh, V. Breast Cancer Prediction system. Procedia Comput. Sci. **2018**, 132, 371–376.
46. Dumitru, D. Prediction of recurrent events in breast cancer using the Naive Bayesian classification. Ann. Univ. Craiova-Math. Comput. Sci. Ser. **2009**, 36, 92–96.
47. Liu, L.; Deng, M. An evolutionary artificial neural network approach for breast cancer diagnosis. In Proceedings of the 2010 Third International Conference on Knowledge Discovery and Data Mining, Phuket, Thailand, 9–10 January 2010; pp. 593–596.
48. Shaikh, T.A.; Ali, R. Applying Machine Learning Algorithms for Early Diagnosis and Prediction of Breast Cancer Risk. In Proceedings of the 2nd International Conference on Communication, Computing and Networking, Islamabad, Pakistan, 6–7 March 2019; pp. 589–598.
49. Alickovic, E.; Subasi, A. Normalized Neural Networks for Breast Cancer Classification. In Proceedings of the International Conference on Medical and Biological Engineering, Banja Luka, Bosnia and Herzegovina, 16–18 May 2019; pp. 519–524.
50. Kaushik, D.; Kaur, K. Application of Data Mining for high accuracy prediction of breast tissue biopsy results. In Proceedings of the 2016 Third International Conference on Digital Information Processing, Data Mining, and Wireless Communications (DIPDMWC), New York, NY, USA, 16–19 July 2016; pp. 40–45.

**Figure 1.** An ensemble method based on a majority-based voting mechanism for breast cancer tumor classification using different machine learning models.
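The majority-based (hard) voting step named in Figure 1 can be illustrated with a minimal Python sketch. The function names, the string class labels, the toy predictions, and the tie-breaking rule (the smallest label wins) are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by the most base classifiers for one sample.

    Ties are broken by picking the smallest label (an assumed convention).
    """
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def ensemble_predict(per_model_predictions):
    """Combine aligned prediction lists (one list per model) sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]


# Hypothetical outputs of three base classifiers on three samples.
model_a = ["benign", "malignant", "benign"]
model_b = ["benign", "benign", "malignant"]
model_c = ["malignant", "malignant", "malignant"]

print(ensemble_predict([model_a, model_b, model_c]))
# ['benign', 'malignant', 'malignant']
```

With an odd number of base classifiers and two classes, no tie can occur, so the tie-breaking rule only matters for an even-sized ensemble such as one built from eight models.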

**Figure 2.** Feature visualization result for WBCD (the x-axis represents the attribute value and the y-axis represents the frequency of each value).

Classification Algorithm | Computational Time (s) |
---|---|
Simple logistic regression model | 0.34 |
SVM learning with SGD optimization | 0.13 |
Multilayer perceptron network | 3.10 |
Random decision tree method | 0.01 |
Random decision forest method | 0.06 |
SVM learning with SMO | 0.30 |
K-nearest neighbor classification | 0.01 |
Naïve Bayes classification | 0.08 |

Classification Algorithms | Accuracy | Precision | Recall | F1 Score | F2 Score | F3 Score |
---|---|---|---|---|---|---|
Simple Logistic Regression Learning | 98.25% | 0.9830 | 0.9820 | 0.9825 | 0.9822 | 0.9821 |
SVM learning with SGD optimization | 97.88% | 0.9791 | 0.9789 | 0.9710 | 0.9710 | 0.9710 |
Multilayer Perceptron Network | 97.66% | 0.9770 | 0.9770 | 0.9770 | 0.9770 | 0.9770 |
Random Decision Tree Method | 91.81% | 0.9200 | 0.9180 | 0.9190 | 0.9184 | 0.9182 |
Random Decision Forest Method | 96.49% | 0.9650 | 0.9650 | 0.9650 | 0.9650 | 0.9650 |
SVM learning with SMO | 97.08% | 0.9710 | 0.9710 | 0.9710 | 0.9710 | 0.9710 |
K-Nearest Neighbor Classification | 97.08% | 0.9710 | 0.9710 | 0.9710 | 0.9710 | 0.9710 |
Naïve Bayes Classification | 91.81% | 0.9190 | 0.9180 | 0.9185 | 0.9182 | 0.9181 |
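The F1, F2 and F3 columns above are consistent with the standard F-beta score, F_β = (1 + β²)·P·R / (β²·P + R), evaluated at β = 1, 2 and 3. The short Python sketch below assumes that definition (the function name `f_beta` is illustrative) and checks it against the precision and recall reported for the simple logistic regression row.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: recall is weighted beta times as much as precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0  # convention when both precision and recall are zero
    return (1 + b2) * precision * recall / denom


# Precision and recall of the simple logistic regression row in the table above.
p, r = 0.9830, 0.9820
print([round(f_beta(p, r, b), 4) for b in (1, 2, 3)])
# [0.9825, 0.9822, 0.9821]
```

Larger values of β pull the score toward recall, which is why the F2 and F3 values approach the recall column whenever precision and recall differ.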

Voting Mechanism | Accuracy | Precision | Recall | F1 Score | F2 Score | F3 Score |
---|---|---|---|---|---|---|
Majority-based | 99.42% | 0.9940 | 0.9940 | 0.994 | 0.9940 | 0.9940 |
Average of probabilities | 98.83% | 0.989 | 0.988 | 0.9885 | 0.9882 | 0.9881 |
Product of probabilities | 98.12% | 0.9850 | 0.9850 | 0.9850 | 0.9850 | 0.9850 |
Minimum of probabilities | 98.46% | 0.986 | 0.981 | 0.9835 | 0.9820 | 0.9815 |
Maximum of probabilities | 99.41% | 0.9840 | 0.9840 | 0.9840 | 0.9840 | 0.9840 |
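The four probability-based rows in the table above correspond to common rules for fusing the class probabilities produced by the base classifiers (soft voting). The following Python sketch shows one plausible reading of those rules; the `RULES` mapping, the `soft_vote` function, and the toy probabilities are hypothetical and are not drawn from the paper.

```python
import math

# Each rule reduces the probabilities that the base classifiers assign to
# one class into a single score for that class.
RULES = {
    "average": lambda ps: sum(ps) / len(ps),
    "product": math.prod,
    "minimum": min,
    "maximum": max,
}


def soft_vote(prob_rows, rule="average"):
    """Pick the class with the highest combined score for one sample.

    prob_rows: one dict per base classifier, mapping class label -> probability.
    """
    combine = RULES[rule]
    classes = prob_rows[0].keys()
    scores = {c: combine([row[c] for row in prob_rows]) for c in classes}
    return max(scores, key=scores.get)


# Hypothetical probabilities from three base classifiers for a single sample.
rows = [
    {"benign": 0.30, "malignant": 0.70},
    {"benign": 0.30, "malignant": 0.70},
    {"benign": 0.85, "malignant": 0.15},
]

for name in RULES:
    print(name, soft_vote(rows, name))
```

In this toy case the average rule selects "malignant", while the product, minimum and maximum rules select "benign", because a single confident classifier dominates those three rules. This sensitivity to individual outliers is one reason different fusion rules can yield different accuracies on the same set of base classifiers.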

Work | Proposed Method | Accuracy |
---|---|---|
Ours | Majority-based voting mechanism | 99.42% (70:30), 98.77% (10-CV) |
Nahato et al. [43] | Backpropagation neural network | 98.60% (80:20) |
Liu et al. [47] | An evolutionary artificial neural network | 97.38% (60:40) |
Chen et al. [44] | A support vector machine classifier with rough set-based feature selection | 89.20% (70:30) |
Kumari et al. [45] | K-nearest neighbor classification algorithm | 99.28% (10-CV) |
Dumitru et al. [46] | Naïve Bayesian classification | 74.24% (-) |
Shaikh et al. [48] | Dimensionality reduction and support vector machine | 97.91% (-) |
Nguyen et al. [15] | Feature selection and ensemble voting | 98.00% (10-CV) |
Alickovic et al. [49] | Normalized multilayer perceptron neural network | 99.27% (-) |
Osman et al. [25] | Ensemble learning using Radial Based Function Neural Network models (RBFNN) | 97.00% (10-CV) |
Kaushik et al. [50] | Ensemble learning via MLP, RF and RT | 93.50% (10-CV) |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Assiri, A.S.; Nazir, S.; Velastin, S.A. Breast Tumor Classification Using an Ensemble Machine Learning Method. *J. Imaging* **2020**, *6*, 39.
https://doi.org/10.3390/jimaging6060039
