# Persistent Homology-Based Machine Learning Method for Filtering and Classifying Mammographic Microcalcification Images in Early Cancer Detection


## Simple Summary

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Dataset

#### 2.2. Persistent Homology

Persistent homology summarises the shape of data through topological features in each dimension: 0-dimensional holes (H₀) representing connected components, 1-dimensional holes (H₁) representing loops, and 2-dimensional holes (H₂) representing voids.
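To make the H₀ notion concrete, the following is a minimal pure-Python sketch of 0-dimensional sublevel-set persistence on a 1-D signal using union-find and the elder rule. It is an illustration only: the study itself computes H₁ on 2-D images via cubical complexes (e.g. with Cubical Ripser [42]), but the birth/death bookkeeping is analogous.

```python
def h0_persistence(values):
    """0-dimensional sublevel-set persistence (connected components, H0)
    of a 1-D signal, computed with union-find and the elder rule."""
    n = len(values)
    parent = list(range(n))

    def find(i):
        # Path-halving find for the union-find forest.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    birth = {}   # component root -> filtration value at which it was born
    pairs = []
    for i in sorted(range(n), key=lambda k: values[k]):  # grow sublevel sets
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and j in birth:     # neighbour already in sublevel set
                ri, rj = find(i), find(j)
                if ri != rj:
                    if birth[ri] > birth[rj]:  # elder rule: younger component dies
                        ri, rj = rj, ri
                    if values[i] > birth[rj]:  # drop zero-lifespan pairs
                        pairs.append((birth[rj], values[i]))
                    parent[rj] = ri
    pairs.append((birth[find(0)], float("inf")))  # global minimum never dies
    return sorted(pairs)
```

For the signal `[0, 3, 1, 4, 2, 5]`, the local minima 0, 1, and 2 give birth to components, and the two younger components die when merged at the local maxima 3 and 4.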

#### 2.2.1. Interpreting the Persistent Diagrams

For binary images, the persistence diagram (PD) captures 0-dimensional holes denoted as H₀ (connected complete black pixels) and 1-dimensional holes denoted as H₁, or loops (complete white pixels inside components made of black pixels). However, some dimensions may be irrelevant to the study [31]. As microcalcifications appear as white spots, this study employs only H₁. Figure 5 illustrates samples of PDs for H₁ obtained from benign and malignant microcalcifications in different datasets.

- Step 1: Obtain PDs for 1-dimensional holes (H₁);
- Step 2: Calculate the lifespan for each point in the PD;
- Step 3: Find the maximum lifespan;
- Step 4: Filter out points whose lifespan falls below a chosen percentage of the maximum lifespan.
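The four steps above can be sketched as a small function. This is a minimal sketch assuming the PD is given as (birth, death) pairs and the filtering level as a fraction of the maximum lifespan (e.g. 0.2 for the 20% filter):

```python
import numpy as np

def filter_pd(pd_points, level):
    """Multi-level lifespan filtering of a persistence diagram.

    Keeps only the points whose lifespan exceeds `level` times the
    maximum lifespan (Steps 1-4 above)."""
    pd_points = np.asarray(pd_points, dtype=float)
    lifespans = pd_points[:, 1] - pd_points[:, 0]   # Step 2: death - birth
    threshold = level * lifespans.max()             # Step 3: fraction of max
    return pd_points[lifespans > threshold]         # Step 4: keep long-lived points
```

Each surviving row is a long-lived topological feature; short-lived points, which typically correspond to noise, are discarded.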

The number of points in a PD equals the Betti number B₁ for H₁. Hence, the filtering process reduces the B₁ value. Figure 6 demonstrates a schematic representation of multi-level filtering of the PD for a sample malignant image. At filtering levels of 10%, 20%, 30%, 40%, and 50% of the maximum lifespan, B₁ is reduced from 1341 points initially to 510, 162, 68, 32, and 18, respectively. The original and filtered PDs are then used to obtain the feature vector representations, which can be easily incorporated as input in machine learning models.

#### 2.2.2. Vectorised Topological Features

- Persistent Entropy (PE)

- Persistent Image (PI)
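Persistent entropy is the Shannon entropy of the normalised lifespans of the PD points, so it summarises a diagram of any size in a single number. A minimal sketch (finite lifespans assumed; the persistent image, which weights a Gaussian at each PD point on a grid, is omitted here):

```python
import numpy as np

def persistent_entropy(pd_points):
    """Persistent entropy of a PD: Shannon entropy of normalised lifespans."""
    pts = np.asarray(pd_points, dtype=float)
    life = pts[:, 1] - pts[:, 0]        # lifespan of each point
    p = life / life.sum()               # normalise lifespans to probabilities
    return float(-(p * np.log(p)).sum())
```

A diagram whose lifespans are all equal attains the maximum entropy log(n); a diagram dominated by one long-lived point has entropy near zero.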

#### 2.3. Machine Learning Classifier

- Neural network (NN);
- Support vector machine (SVM);
- K-nearest neighbour (KNN);
- Decision tree (DT).

#### 2.4. Performance Evaluation

- Confusion matrix: Provides a matrix as output that describes the method’s performance, consisting of the total numbers of correct and incorrect predictions. The matrix is shown in Figure 7.
- Classification Accuracy (CA): The percentage of microcalcifications correctly classified out of the total number of observations. It can be measured as follows: $$CA=\frac{TP+TN}{TP+TN+FP+FN}.$$
- Area under the Curve (AUC): The AUC can be measured by calculating the area under the receiver operating characteristic (ROC) curve. The ROC curve is a plot of the true positive rate (TPR), called sensitivity or recall, versus the false positive rate (FPR). TPR is defined in this context as the number of correctly diagnosed malignant cases divided by the total number of malignant cases. In contrast, FPR is defined as the number of benign cases wrongly classified as malignant divided by the total number of benign instances. TPR and FPR can be calculated using Equations (4) and (5), respectively: $$TPR=\frac{TP}{TP+FN}.$$ $$FPR=\frac{FP}{FP+TN}.$$
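The three count-based measures above reduce to a few lines; this sketch assumes the four confusion-matrix counts are already known:

```python
def metrics(tp, tn, fp, fn):
    """Classification accuracy, true-positive rate and false-positive rate
    from confusion-matrix counts (Equations above)."""
    ca = (tp + tn) / (tp + tn + fp + fn)   # fraction correctly classified
    tpr = tp / (tp + fn)                   # sensitivity / recall
    fpr = fp / (fp + tn)                   # benign wrongly called malignant
    return ca, tpr, fpr
```

For example, 8 true positives, 7 true negatives, 3 false positives, and 2 false negatives give CA = 0.75, TPR = 0.8, and FPR = 0.3.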

#### 2.5. Implementation Details

- Neural network (NN): classifier type = medium, number of fully connected layers = 1, first layer size = 25, and activation function = ReLU.
- Support vector machine (SVM): kernel type = linear, kernel scale = automatic, and box constraint level = 1.
- K-nearest neighbour (KNN): classifier type = fine, number of neighbours = 1, distance metric = Euclidean, and distance weight = equal.
- Decision tree (DT): classifier type = fine tree, maximum number of splits = 100, and split criterion = Gini’s diversity index.
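The settings above are MATLAB Classification Learner options; a rough scikit-learn equivalent is sketched below. The mapping is approximate: scikit-learn has no direct "maximum number of splits", so `max_leaf_nodes=101` stands in for 100 splits (a binary tree with k splits has k + 1 leaves), and the SVM box constraint maps to `C=1.0`.

```python
# Approximate scikit-learn counterparts of the MATLAB settings listed above.
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

models = {
    # One fully connected layer of 25 units with ReLU activation.
    "NN": MLPClassifier(hidden_layer_sizes=(25,), activation="relu"),
    # Linear kernel, box constraint level 1.
    "SVM": SVC(kernel="linear", C=1.0),
    # Fine KNN: 1 neighbour, Euclidean distance, equal weights.
    "KNN": KNeighborsClassifier(n_neighbors=1, metric="euclidean",
                                weights="uniform"),
    # Fine tree: Gini criterion; 101 leaves ~ 100 splits.
    "DT": DecisionTreeClassifier(criterion="gini", max_leaf_nodes=101),
}
```

Each model exposes the usual `fit`/`predict` interface, so the same feature vectors (PI, PE, or their concatenation) can be passed to all four.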

## 3. Results

#### 3.1. Topological Filtering

#### 3.2. Classification Performance

#### 3.3. Performance of Machine Learning Models

## 4. Discussion and Future Work

#### Comparative Analysis

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

- Ferlay, J.; Colombet, M.; Soerjomataram, I.; Parkin, D.M.; Piñeros, M.; Znaor, A.; Bray, F. Cancer Statistics for the Year 2020: An Overview. Int. J. Cancer **2021**, 149, 778–789.
- Vy, V.P.T.; Yao, M.M.-S.; Le, N.Q.K.; Chan, W.P. Machine Learning Algorithm for Distinguishing Ductal Carcinoma in Situ from Invasive Breast Cancer. Cancers **2022**, 14, 2437.
- Ramadan, S.Z. Methods Used in Computer-Aided Diagnosis for Breast Cancer Detection Using Mammograms: A Review. J. Healthc. Eng. **2020**, 2020, 9162464.
- Htay, M.N.N.; Donnelly, M.; Schliemann, D.; Loh, S.Y.; Dahlui, M.; Somasundaram, S.; Tamin, N.S.B.I.; Su, T.T. Breast Cancer Screening in Malaysia: A Policy Review. Asian Pac. J. Cancer Prev. **2021**, 22, 1685.
- Melekoodappattu, J.G.; Subbian, P.S.; Queen, M.P.F. Detection and Classification of Breast Cancer from Digital Mammograms Using Hybrid Extreme Learning Machine Classifier. Int. J. Imaging Syst. Technol. **2021**, 31, 909–920.
- Oliver, A.; Torrent, A.; Lladó, X.; Tortajada, M.; Tortajada, L.; Sentís, M.; Freixenet, J.; Zwiggelaar, R. Automatic Microcalcification and Cluster Detection for Digital and Digitised Mammograms. Knowl. Based Syst. **2012**, 28, 68–75.
- Suckling, J. The Mammographic Image Analysis Society Digital Mammogram Database. Exerpta Med. Int. Congr. **1994**, 1069, 375–386.
- Azam, A.S.B.; Malek, A.A.; Ramlee, A.S.; Suhaimi, N.D.S.M.; Mohamed, N. Segmentation of Breast Microcalcification Using Hybrid Method of Canny Algorithm with Otsu Thresholding and 2D Wavelet Transform. In Proceedings of the 2020 10th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), Penang, Malaysia, 21–22 August 2020; pp. 91–96.
- Dabass, J.; Arora, S.; Vig, R.; Hanmandlu, M. Segmentation Techniques for Breast Cancer Imaging Modalities-A Review. In Proceedings of the 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 10–11 January 2019; pp. 658–663.
- Banumathy, D.; Khalaf, O.I.; Romero, C.A.T.; Raja, P.V.; Sharma, D.K. Breast Calcifications and Histopathological Analysis on Tumour Detection by CNN. Comput. Syst. Sci. Eng. **2023**, 44, 595–612.
- Roty, S.; Wiratkapun, C.; Tanawongsuwan, R.; Phongsuphap, S. Analysis of Microcalcification Features for Pathological Classification of Mammograms. In Proceedings of the 2017 10th Biomedical Engineering International Conference (BMEiCON), Hokkaido, Japan, 31 August–2 September 2017; pp. 1–5.
- Fan, L.; Zhang, F.; Fan, H.; Zhang, C. Brief Review of Image Denoising Techniques. Vis. Comput. Ind. Biomed. Art **2019**, 2, 7.
- Krishnan, A. An Overview of Mammogram Noise and Denoising Techniques. Int. J. Eng. Res. Gen. Sci. **2016**, 4, 557–563.
- Patil, R.S.; Biradar, N. Automated Mammogram Breast Cancer Detection Using the Optimized Combination of Convolutional and Recurrent Neural Network. Evol. Intell. **2020**, 14, 1459–1474.
- Fadil, R.; Jackson, A.; El Majd, B.A.; El Ghazi, H.; Kaabouch, N. Classification of Microcalcifications in Mammograms Using 2D Discrete Wavelet Transform and Random Forest. In Proceedings of the 2020 IEEE International Conference on Electro Information Technology (EIT), Chicago, IL, USA, 31 July–1 August 2020; pp. 353–359.
- Mahmood, T.; Li, J.; Pei, Y.; Akhtar, F.; Imran, A.; Yaqub, M. An Automatic Detection and Localization of Mammographic Microcalcifications ROI with Multi-Scale Features Using the Radiomics Analysis Approach. Cancers **2021**, 13, 5916.
- Gowri, V.; Valluvan, K.R.; Vijaya Chamundeeswari, V. Automated Detection and Classification of Microcalcification Clusters with Enhanced Preprocessing and Fractal Analysis. Asian Pac. J. Cancer Prev. **2018**, 19, 3093–3098.
- Pun, C.S.; Lee, S.X.; Xia, K. Persistent-Homology-Based Machine Learning: A Survey and a Comparative Study. Artif. Intell. Rev. **2022**, 55, 5169–5213.
- Choe, S.; Ramanna, S. Cubical Homology-Based Machine Learning: An Application in Image Classification. Axioms **2022**, 11, 112.
- Asaad, A.; Ali, D.; Majeed, T.; Rashid, R. Persistent Homology for Breast Tumor Classification Using Mammogram Scans. Mathematics **2022**, 10, 4039.
- Kusano, G.; Fukumizu, K.; Hiraoka, Y. Kernel Method for Persistence Diagrams via Kernel Embedding and Weight Factor. J. Mach. Learn. Res. **2018**, 18, 1–41.
- Moroni, D.; Pascali, M.A. Learning Topology: Bridging Computational Topology and Machine Learning. Pattern Recognit. Image Anal. **2021**, 31, 443–453.
- Avilés-Rodríguez, G.J.; Nieto-Hipólito, J.I.; Cosío-León, M.D.L.Á.; Romo-Cárdenas, G.S.; Sánchez-López, J.D.D.; Radilla-Chávez, P.; Vázquez-Briseño, M. Topological Data Analysis for Eye Fundus Image Quality Assessment. Diagnostics **2021**, 11, 1322.
- Adams, H.; Emerson, T.; Kirby, M.; Neville, R.; Peterson, C.; Shipman, P.; Chepushtanova, S.; Hanson, E.; Motta, F.; Ziegelmeier, L. Persistence Images: A Stable Vector Representation of Persistent Homology. J. Mach. Learn. Res. **2017**, 18, 1–35.
- Teramoto, T.; Shinohara, T.; Takiyama, A. Computer-Aided Classification of Hepatocellular Ballooning in Liver Biopsies from Patients with NASH Using Persistent Homology. Comput. Methods Programs Biomed. **2020**, 195, 105614.
- Oyama, A.; Hiraoka, Y.; Obayashi, I.; Saikawa, Y.; Furui, S.; Shiraishi, K.; Kumagai, S.; Hayashi, T.; Kotoku, J. Hepatic Tumor Classification Using Texture and Topology Analysis of Non-Contrast-Enhanced Three-Dimensional T1-Weighted MR Images with a Radiomics Approach. Sci. Rep. **2019**, 9, 8764.
- Leykam, D.; Rondón, I.; Angelakis, D.G. Dark Soliton Detection Using Persistent Homology. Chaos Interdiscip. J. Nonlinear Sci. **2022**, 32, 73133.
- Edwards, P.; Skruber, K.; Milićević, N.; Heidings, J.B.; Read, T.A.; Bubenik, P.; Vitriol, E.A. TDAExplore: Quantitative Analysis of Fluorescence Microscopy Images through Topology-Based Machine Learning. Patterns **2021**, 2, 100367.
- Nishio, M.; Nishio, M.; Jimbo, N.; Nakane, K. Homology-Based Image Processing for Automatic Classification of Histopathological Images of Lung Tissue. Cancers **2021**, 13, 1192.
- Rammal, A.; Assaf, R.; Goupil, A.; Kacim, M.; Vrabie, V. Machine Learning Techniques on Homological Persistence Features for Prostate Cancer Diagnosis. BMC Bioinform. **2022**, 23, 476.
- Conti, F.; Moroni, D.; Pascali, M.A. A Topological Machine Learning Pipeline for Classification. Mathematics **2022**, 10, 3086.
- Heath, M.; Bowyer, K.; Kopans, D.; Moore, R.; Kegelmeyer, P. The Digital Database for Screening Mammography. In Proceedings of the Fifth International Workshop on Digital Mammography, Toronto, ON, Canada, 20–23 July 2000; pp. 212–218.
- Beksi, W.J.; Papanikolopoulos, N. 3D Region Segmentation Using Topological Persistence. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejon, Korea, 9–14 October 2016; pp. 1079–1084.
- Otter, N.; Porter, M.A.; Tillmann, U.; Grindrod, P.; Harrington, H.A. A Roadmap for the Computation of Persistent Homology. EPJ Data Sci. **2017**, 6, 1–38.
- Kramár, M.; Levanger, R.; Tithof, J.; Suri, B.; Xu, M.; Paul, M.; Schatz, M.F.; Mischaikow, K. Analysis of Kolmogorov Flow and Rayleigh–Bénard Convection Using Persistent Homology. Physica D **2016**, 334, 82–98.
- Garin, A.; Tauzin, G. A Topological “reading” Lesson: Classification of MNIST Using TDA. In Proceedings of the 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), Boca Raton, FL, USA, 16–19 December 2019; pp. 1551–1556.
- Pun, C.S.; Xia, K.; Lee, S.X. Persistent-Homology-Based Machine Learning and Its Applications—A Survey. arXiv **2018**, arXiv:1811.00252.
- Chazal, F.; Michel, B. An Introduction to Topological Data Analysis: Fundamental and Practical Aspects for Data Scientists. Front. Artif. Intell. **2021**, 4, 1–28.
- Atienza, N.; Gonzalez-Díaz, R.; Soriano-Trigueros, M. On the Stability of Persistent Entropy and New Summary Functions for Topological Data Analysis. Pattern Recognit. **2020**, 107, 107509.
- Moon, C.; Li, Q.; Xiao, G. Using Persistent Homology Topological Features to Characterize Medical Images: Case Studies on Lung and Brain Cancers. arXiv **2020**, arXiv:2012.12102.
- Jiao, Y.; Du, P. Performance Measures in Evaluating Machine Learning Based Bioinformatics Predictors for Classifications. Quant. Biol. **2016**, 4, 320–330.
- Kaji, S.; Sudo, T.; Ahara, K. Cubical Ripser: Software for Computing Persistent Homology of Image and Volume Data. arXiv **2020**, arXiv:2005.12692.
- Turkes, R.; Nys, J.; Verdonck, T.; Latre, S. Noise Robustness of Persistent Homology on Greyscale Images, across Filtrations and Signatures. PLoS ONE **2021**, 16, e0257215.
- Sakka, E.; Prentza, A.; Koutsouris, D. Classification Algorithms for Microcalcifications in Mammograms (Review). Oncol. Rep. **2006**, 15, 1049–1055.
- Temmermans, F.; Jansen, B.; Willekens, I.; Van de Casteele, E.; Deklerck, R.; Schelkens, P.; De Mey, J. Classification of Microcalcifications Using Micro-CT. In Proceedings of the Applications of Digital Image Processing XXXVI, San Diego, CA, USA, 26–29 August 2013; Tescher, A.G., Ed.; SPIE: Bellingham, WA, USA, 2013; Volume 8856.
- Suhail, Z.; Denton, E.R.E.; Zwiggelaar, R. Classification of Micro-Calcification in Mammograms Using Scalable Linear Fisher Discriminant Analysis. Med. Biol. Eng. Comput. **2018**, 56, 1475–1485.
- Chen, Z.; Strange, H.; Oliver, A.; Denton, E.R.E.; Boggis, C.; Zwiggelaar, R. Topological Modeling and Classification of Mammographic Microcalcification Clusters. IEEE Trans. Biomed. Eng. **2015**, 62, 1203–1214.
- Strange, H.; Chen, Z.; Denton, E.R.E.; Zwiggelaar, R. Modelling Mammographic Microcalcification Clusters Using Persistent Mereotopology. Pattern Recognit. Lett. **2014**, 47, 157–163.

**Figure 1.** Sample microcalcification on mammogram images in the MIAS dataset [7]. (**a**) Benign microcalcifications and (**b**) malignant microcalcifications.

**Figure 4.** Example of the cubical complex in a greyscale image. (**a**) Image pixel values; (**b**) the filtered cubical complex; (**c**) the lifespan of connected components and loops based on (**b**); and (**d**) the persistent diagram based on (**c**).

**Figure 6.** Example of a multi-level filtering PD with Betti number (B₁). (**a**) Original PD with max lifespan (red circle), B₁ = 1341; (**b**) 10% of data filtered out, B₁ = 510; (**c**) 20% of data filtered out, B₁ = 162; (**d**) 30% of data filtered out, B₁ = 68; (**e**) 40% of data filtered out, B₁ = 32; and (**f**) 50% of data filtered out, B₁ = 18.

**Figure 8.** Scatter plot of PE versus PI with different levels of filters in the MIAS dataset [7]. (**a**) Non-filter; (**b**) filter 10% of max lifespan; (**c**) filter 20% of max lifespan; (**d**) filter 30% of max lifespan; (**e**) filter 40% of max lifespan; and (**f**) filter 50% of max lifespan.

**Figure 9.** Scatter plot of persistent entropy versus persistent image with different levels of filters in the DDSM dataset [32]. (**a**) Non-filter; (**b**) filter 10% of max lifespan; (**c**) filter 20% of max lifespan; (**d**) filter 30% of max lifespan; (**e**) filter 40% of max lifespan; and (**f**) filter 50% of max lifespan.

**Figure 10.** Performance comparison of machine learning classifiers based on the optimal filter (20%) in the MIAS dataset. TC = true class and PC = predicted class. (**a**) Accuracy and AUC of classifiers. (**b**) Confusion matrix of concatenating features.

**Figure 11.** Performance comparison of machine learning classifiers based on the optimal filter (30%) in the DDSM dataset. (**a**) Accuracy and AUC of classifiers. (**b**) Confusion matrix of concatenating features.

**Figure 12.** Discriminant feature values based on the Decision Tree model. (**a**) MIAS dataset; (**b**) DDSM dataset.

**Table 1.**Classification performance of the PI feature in the MIAS dataset. CA = accuracy and Std = standard deviation.

| Pers. Image Condition | NN CA ± Std | NN AUC | KNN CA ± Std | KNN AUC | SVM CA ± Std | SVM AUC | DT CA ± Std | DT AUC |
|---|---|---|---|---|---|---|---|---|
| Non-Filter | 76.9 ± 8.3 | 0.76 | 73.1 ± 8.7 | 0.73 | 69.2 ± 9.1 | 0.85 | 73.1 ± 8.7 | 0.74 |
| Filter 10% | 80.8 ± 7.7 | 0.91 | 73.1 ± 8.7 | 0.73 | 73.1 ± 8.7 | 0.86 | 73.1 ± 8.7 | 0.74 |
| Filter 20% | 92.3 ± 5.2 | 0.93 | 88.5 ± 6.3 | 0.88 | 76.9 ± 8.3 | 0.93 | 92.3 ± 5.2 | 0.90 |
| Filter 30% | 76.9 ± 8.3 | 0.89 | 76.9 ± 8.3 | 0.77 | 73.1 ± 8.7 | 0.89 | 73.1 ± 8.7 | 0.74 |
| Filter 40% | 73.1 ± 8.7 | 0.72 | 73.1 ± 8.7 | 0.73 | 73.1 ± 8.7 | 0.90 | 76.9 ± 8.3 | 0.73 |
| Filter 50% | 69.2 ± 9.1 | 0.69 | 69.2 ± 9.1 | 0.69 | 73.1 ± 8.7 | 0.85 | 73.1 ± 8.7 | 0.68 |

**Table 2.** Classification performance of the PI feature in the DDSM dataset. CA = accuracy and Std = standard deviation.

| Pers. Image Condition | NN CA ± Std | NN AUC | KNN CA ± Std | KNN AUC | SVM CA ± Std | SVM AUC | DT CA ± Std | DT AUC |
|---|---|---|---|---|---|---|---|---|
| Non-Filter | 71.4 ± 3.8 | 0.74 | 50.0 ± 4.2 | 0.50 | 59.3 ± 4.2 | 0.69 | 63.6 ± 4.1 | 0.68 |
| Filter 10% | 86.4 ± 2.9 | 0.93 | 82.9 ± 3.2 | 0.83 | 82.9 ± 3.2 | 0.83 | 85.7 ± 3.0 | 0.92 |
| Filter 20% | 90.7 ± 2.5 | 0.95 | 86.4 ± 2.9 | 0.86 | 92.9 ± 2.2 | 0.94 | 90.0 ± 2.5 | 0.93 |
| Filter 30% | 93.6 ± 2.1 | 0.94 | 93.6 ± 2.1 | 0.94 | 93.6 ± 2.1 | 0.94 | 94.3 ± 1.9 | 0.95 |
| Filter 40% | 91.4 ± 2.4 | 0.97 | 87.1 ± 2.8 | 0.87 | 89.3 ± 2.6 | 0.97 | 87.1 ± 2.8 | 0.85 |
| Filter 50% | 81.4 ± 3.3 | 0.91 | 85.0 ± 3.0 | 0.85 | 82.9 ± 3.2 | 0.93 | 83.6 ± 3.1 | 0.88 |

**Table 3.** Classification performance of the PE feature in the MIAS dataset. CA = accuracy and Std = standard deviation.

| Pers. Entropy Condition | NN CA ± Std | NN AUC | KNN CA ± Std | KNN AUC | SVM CA ± Std | SVM AUC | DT CA ± Std | DT AUC |
|---|---|---|---|---|---|---|---|---|
| Non-Filter | 53.8 ± 9.8 | 0.55 | 50.0 ± 9.8 | 0.50 | 69.2 ± 9.1 | 0.78 | 46.2 ± 9.8 | 0.59 |
| Filter 10% | 57.7 ± 9.7 | 0.70 | 65.4 ± 9.3 | 0.65 | 65.4 ± 9.3 | 0.79 | 65.4 ± 9.3 | 0.69 |
| Filter 20% | 84.6 ± 7.1 | 0.90 | 92.3 ± 5.2 | 0.92 | 80.8 ± 7.7 | 0.89 | 73.1 ± 8.7 | 0.75 |
| Filter 30% | 80.8 ± 7.7 | 0.83 | 80.8 ± 7.7 | 0.81 | 80.8 ± 7.7 | 0.89 | 80.8 ± 7.7 | 0.80 |
| Filter 40% | 76.9 ± 8.3 | 0.81 | 80.8 ± 7.7 | 0.81 | 76.9 ± 8.3 | 0.92 | 84.6 ± 7.1 | 0.79 |
| Filter 50% | 76.9 ± 8.3 | 0.79 | 73.1 ± 8.7 | 0.73 | 80.8 ± 7.7 | 0.89 | 88.5 ± 6.3 | 0.83 |

**Table 4.** Classification performance of the PE feature in the DDSM dataset. CA = accuracy and Std = standard deviation.

| Pers. Entropy Condition | NN CA ± Std | NN AUC | KNN CA ± Std | KNN AUC | SVM CA ± Std | SVM AUC | DT CA ± Std | DT AUC |
|---|---|---|---|---|---|---|---|---|
| Non-Filter | 50.7 ± 4.2 | 0.55 | 50.0 ± 4.2 | 0.50 | 49.3 ± 4.2 | 0.53 | 55.7 ± 4.2 | 0.54 |
| Filter 10% | 87.9 ± 2.8 | 0.92 | 79.3 ± 3.4 | 0.79 | 87.9 ± 2.8 | 0.94 | 80.7 ± 3.3 | 0.84 |
| Filter 20% | 92.1 ± 2.3 | 0.96 | 91.4 ± 2.4 | 0.94 | 93.6 ± 2.1 | 0.97 | 90.7 ± 2.5 | 0.95 |
| Filter 30% | 95.0 ± 1.8 | 0.97 | 96.4 ± 1.6 | 0.98 | 95.7 ± 1.7 | 0.99 | 92.9 ± 2.2 | 0.94 |
| Filter 40% | 92.1 ± 2.3 | 0.97 | 89.3 ± 2.6 | 0.89 | 94.3 ± 1.9 | 0.97 | 90.7 ± 2.5 | 0.89 |
| Filter 50% | 85.7 ± 2.9 | 0.93 | 82.9 ± 3.2 | 0.83 | 87.9 ± 2.8 | 0.97 | 85.7 ± 3.0 | 0.91 |

**Table 5.** Classification performance of the concatenated PI and PE features in the MIAS dataset. CA = accuracy and Std = standard deviation.

| Concatenate Features | NN CA ± Std | NN AUC | KNN CA ± Std | KNN AUC | SVM CA ± Std | SVM AUC | DT CA ± Std | DT AUC |
|---|---|---|---|---|---|---|---|---|
| Non-Filter | 61.5 ± 9.5 | 0.60 | 53.8 ± 9.8 | 0.54 | 69.2 ± 9.1 | 0.82 | 69.2 ± 9.1 | 0.68 |
| Filter 10% | 61.5 ± 9.5 | 0.67 | 53.8 ± 9.8 | 0.54 | 69.2 ± 9.1 | 0.79 | 73.1 ± 8.7 | 0.74 |
| Filter 20% | 96.2 ± 3.7 | 0.96 | 92.3 ± 5.2 | 0.92 | 88.5 ± 6.3 | 0.93 | 92.3 ± 5.2 | 0.91 |
| Filter 30% | 84.6 ± 7.1 | 0.84 | 80.8 ± 7.7 | 0.81 | 80.8 ± 7.7 | 0.89 | 84.6 ± 7.1 | 0.82 |
| Filter 40% | 84.6 ± 7.1 | 0.83 | 80.8 ± 7.7 | 0.81 | 80.8 ± 7.7 | 0.89 | 84.6 ± 7.1 | 0.77 |
| Filter 50% | 76.9 ± 8.3 | 0.81 | 80.8 ± 7.7 | 0.81 | 80.8 ± 7.7 | 0.89 | 84.6 ± 7.1 | 0.80 |

**Table 6.** Classification performance of the concatenated PI and PE features in the DDSM dataset. CA = accuracy and Std = standard deviation.

| Concatenate Features | NN CA ± Std | NN AUC | KNN CA ± Std | KNN AUC | SVM CA ± Std | SVM AUC | DT CA ± Std | DT AUC |
|---|---|---|---|---|---|---|---|---|
| Non-Filter | 71.4 ± 3.8 | 0.75 | 73.6 ± 3.7 | 0.79 | 75.0 ± 3.7 | 0.81 | 68.6 ± 3.9 | 0.73 |
| Filter 10% | 90.0 ± 2.5 | 0.95 | 90.0 ± 2.5 | 0.90 | 91.4 ± 2.4 | 0.96 | 90.0 ± 2.5 | 0.91 |
| Filter 20% | 94.3 ± 1.9 | 0.95 | 94.3 ± 1.9 | 0.94 | 97.1 ± 1.4 | 0.99 | 92.9 ± 2.2 | 0.93 |
| Filter 30% | 97.9 ± 1.2 | 0.98 | 99.3 ± 0.7 | 0.99 | 98.6 ± 1.0 | 0.99 | 97.9 ± 1.2 | 0.98 |
| Filter 40% | 92.9 ± 2.2 | 0.95 | 92.1 ± 2.3 | 0.91 | 95.0 ± 1.8 | 0.99 | 96.4 ± 1.6 | 0.95 |
| Filter 50% | 91.4 ± 2.4 | 0.94 | 87.9 ± 2.8 | 0.88 | 87.9 ± 2.8 | 0.97 | 87.9 ± 2.7 | 0.90 |

**Table 7.** Comparative analysis of the proposed approach with existing methods.

| Method | Features | Image Type | Dataset | Classifier | Result |
|---|---|---|---|---|---|
| Fadil et al. [15] | Texture (GLCM) | Greyscale | DDSM | DWT-RF | CA = 95%, AUC = 0.92 |
| Suhail et al. [46] | Local Features | Binary | DDSM | LDA-SVM | CA = 96%, AUC = 0.95 |
| Mahmood et al. [16] | Textural and Statistical | Binary | MIAS | Radiomic-SVM | CA = 98%, AUC = 0.90 |
| Gowri et al. [17] | Textural with Fractal Analysis | Binary | MIAS | NN | CA = 96.3%, AUC = NA |
| Melekoodappattu et al. [5] | SURF, Gabor, and GLCM | Greyscale | MIAS | GSO-ELM-FOA | CA = 99.15%, AUC = NA |
| Chen et al. [47] | Multiscale Morphology Graph | Binary | MIAS | KNN | CA = 95%, AUC = 0.96 |
|  |  |  | DDSM | KNN | CA = 85.2%, AUC = 0.90 |
| Strange et al. [48] | Mereotopological Barcode | Binary | MIAS | KNN | CA = 95%, AUC = 0.96 |
|  |  |  | DDSM | KNN | CA = 80%, AUC = 0.82 |
| Proposed Approach | PI and PE Features | Greyscale | MIAS | NN | CA = 96.2%, AUC = 0.96 |
|  |  |  | DDSM | KNN | CA = 99.3%, AUC = 0.99 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Malek, A.A.; Alias, M.A.; Razak, F.A.; Noorani, M.S.M.; Mahmud, R.; Zulkepli, N.F.S.
Persistent Homology-Based Machine Learning Method for Filtering and Classifying Mammographic Microcalcification Images in Early Cancer Detection. *Cancers* **2023**, *15*, 2606.
https://doi.org/10.3390/cancers15092606
