# Breast Tumor Classification Using an Ensemble Machine Learning Method

## 1. Introduction

## 2. Related Work

## 3. Methodology

#### 3.1. Classification Methods

#### 3.1.1. Simple Logistic Regression Model

#### 3.1.2. SVM Learning with Stochastic Gradient Descent (SGD) Optimization

#### 3.1.3. Multilayer Perceptron Network

#### 3.1.4. Random Decision Tree

#### 3.1.5. Random Decision Forest

#### 3.1.6. SVM Learning with Sequential Minimal Optimization (SMO)

#### 3.1.7. K-Nearest Neighbor Classification

#### 3.1.8. Naïve Bayes Classification

#### 3.2. Voting Mechanism

#### 3.2.1. Majority-Based Voting Mechanism (Hard Voting)

#### 3.2.2. Soft Voting

## 4. Results

#### Performance Evaluation Measures

## 5. Discussion

## 6. Comparison with Existing Work

## 7. Conclusions

**Figure 1.** An ensemble method based on a majority-based voting mechanism for breast cancer tumor classification using different machine learning models.

**Figure 2.** Feature visualization result for WBCD (the x-axis represents the attribute value and the y-axis represents the frequency of each value).

Classification Algorithm | Computational Time (s) |
---|---|
Simple logistic regression model | 0.34 |
SVM learning with SGD optimization | 0.13 |
Multilayer perceptron network | 3.10 |
Random decision tree method | 0.01 |
Random decision forest method | 0.06 |
SVM learning with SMO | 0.30 |
K-nearest neighbor classification | 0.01 |
Naïve Bayes classification | 0.08 |

Classification Algorithms | Accuracy | Precision | Recall | F1 Score | F2 Score | F3 Score |
---|---|---|---|---|---|---|
Simple Logistic Regression Learning | 98.25% | 0.9830 | 0.9820 | 0.9825 | 0.9822 | 0.9821 |
SVM learning with SGD optimization | 97.88% | 0.9791 | 0.9789 | 0.9710 | 0.9710 | 0.9710 |
Multilayer Perceptron Network | 97.66% | 0.9770 | 0.9770 | 0.9770 | 0.9770 | 0.9770 |
Random Decision Tree Method | 91.81% | 0.9200 | 0.9180 | 0.9190 | 0.9184 | 0.9182 |
Random Decision Forest Method | 96.49% | 0.9650 | 0.9650 | 0.9650 | 0.9650 | 0.9650 |
SVM learning with SMO | 97.08% | 0.9710 | 0.9710 | 0.9710 | 0.9710 | 0.9710 |
K-Nearest Neighbor Classification | 97.08% | 0.9710 | 0.9710 | 0.9710 | 0.9710 | 0.9710 |
Naïve Bayes Classification | 91.81% | 0.9190 | 0.9180 | 0.9185 | 0.9182 | 0.9181 |
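
The F1, F2 and F3 columns are instances of the F-beta score, the weighted harmonic mean of precision P and recall R, F_β = (1 + β²)·P·R / (β²·P + R), where larger β weights recall more heavily. The following is a minimal sketch of that formula, assuming the scores are computed directly from the rounded precision and recall in the table; the function name `f_beta` is illustrative and does not come from the original work.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F-beta score)."""
    denom = beta ** 2 * precision + recall
    if denom == 0:
        # Convention assumed here: the score is 0 when both inputs are 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denom


# Precision and recall taken from the simple logistic regression row.
p, r = 0.9830, 0.9820
scores = [round(f_beta(p, r, beta), 4) for beta in (1, 2, 3)]
print(scores)
```

For the simple logistic regression row (P = 0.9830, R = 0.9820) this gives 0.9825, 0.9822 and 0.9821, which agrees with the F1, F2 and F3 values listed for that classifier.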

Voting Mechanism | Accuracy | Precision | Recall | F1 Score | F2 Score | F3 Score |
---|---|---|---|---|---|---|
Majority-based | 99.42% | 0.9940 | 0.9940 | 0.994 | 0.9940 | 0.9940 |
Average of probabilities | 98.83% | 0.989 | 0.988 | 0.9885 | 0.9882 | 0.9881 |
Product of probabilities | 98.12% | 0.9850 | 0.9850 | 0.9850 | 0.9850 | 0.9850 |
Minimum of probabilities | 98.46% | 0.986 | 0.981 | 0.9835 | 0.9820 | 0.9815 |
Maximum of probabilities | 99.41% | 0.9840 | 0.9840 | 0.9840 | 0.9840 | 0.9840 |
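
The voting mechanisms listed above differ only in how the outputs of the base classifiers are combined. The following is a minimal sketch of the five rules, assuming each base classifier returns either a predicted label (majority-based voting) or a vector of class probabilities (the four probability-based rules). The function names, the class ordering and the example probabilities are hypothetical and do not come from the original work.

```python
from collections import Counter
from math import prod


def hard_vote(labels):
    """Majority-based voting: return the most frequently predicted label."""
    # Ties are not handled specially in this sketch.
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas, rule="average"):
    """Combine class-probability vectors and return the winning class index."""
    combine = {
        "average": lambda column: sum(column) / len(column),
        "product": prod,
        "minimum": min,
        "maximum": max,
    }[rule]
    # zip(*probas) groups the probabilities of each class across classifiers.
    scores = [combine(column) for column in zip(*probas)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical outputs of three classifiers; index 0 = benign, 1 = malignant.
probas = [[0.90, 0.10], [0.40, 0.60], [0.45, 0.55]]
labels = [max(range(2), key=p.__getitem__) for p in probas]

print("majority:", hard_vote(labels))
for rule in ("average", "product", "minimum", "maximum"):
    print(rule + ":", soft_vote(probas, rule))
```

In this made-up example two of the three classifiers lean slightly towards class 1, so the majority vote picks class 1, while all four probability-based rules pick class 0 because the first classifier is far more confident. This illustrates why the rules can produce different accuracies on the same set of base classifiers.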

Work | Proposed Method | Accuracy |
---|---|---|
Ours | Majority-based voting mechanism | 99.42% (70:30), 98.77% (10-CV) |
Nahato et al. [43] | Backpropagation neural network | 98.60% (80:20) |
Liu et al. [47] | An evolutionary artificial neural network | 97.38% (60:40) |
Chen et al. [44] | A support vector machine classifier with rough set-based feature selection | 89.20% (70:30) |
Kumari et al. [45] | K-nearest neighbor classification algorithm | 99.28% (10-CV) |
Dumitru et al. [46] | Naïve Bayesian classification | 74.24% (-) |
Shaikh et al. [48] | Dimensionality reduction and support vector machine | 97.91% (-) |
Nguyen et al. [15] | Feature selection and ensemble voting | 98.00% (10-CV) |
Alickovic et al. [49] | Normalized multilayer perceptron neural network | 99.27% (-) |
Osman et al. [25] | Ensemble learning using radial basis function neural network (RBFNN) models | 97.00% (10-CV) |
Kaushik et al. [50] | Ensemble learning via MLP, RF and RT | 93.50% (10-CV) |
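
The annotations in the accuracy column name the evaluation protocol: a ratio such as 70:30 presumably denotes a hold-out split into training and test data, and 10-CV denotes 10-fold cross-validation. The following is a minimal sketch of how sample indices could be partitioned under each protocol. The helper names, the fixed seed and the sample count of 100 are assumptions made for illustration and are not taken from any of the compared works.

```python
import random


def holdout_split(n, train_fraction=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into train and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def kfold_splits(n, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


# Hypothetical data set of 100 samples.
train_idx, test_idx = holdout_split(100, train_fraction=0.7)
cv_splits = list(kfold_splits(100, k=10))
print(len(train_idx), len(test_idx), len(cv_splits))
```

A hold-out accuracy depends on a single random partition, whereas a cross-validated accuracy averages over all folds, so figures reported under the two protocols are not directly comparable.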

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

