Toward Interpretable Cell Image Representation and Abnormality Scoring for Cervical Cancer Screening Using Pap Smears
Abstract
1. Introduction
- We demonstrate that variational autoencoder (VAE) models can learn interpretable features from Pap-smear datasets;
- The trained models can detect abnormalities by fitting a Gaussian distribution to the latent features of normal samples only;
- Additional image augmentation during the training of generative models can enhance the distinction between normal and abnormal samples;
- The formulation of statistical distance based on cross-entropy enables agglomerative clustering that outperforms conventional clustering methods;
- Finally, our model can be generalized to other datasets containing images of cervical cells. It is capable of distinguishing normal and abnormal images by using the pretrained encoder.
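The second contribution above (a Gaussian estimated from normal samples in latent space, used to score abnormality) can be illustrated with a minimal sketch. It assumes the latent vectors are already available as NumPy arrays; the function names, the small ridge term added to the covariance, and the synthetic stand-in data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def fit_gaussian(latents):
    """Estimate the mean and covariance of latent vectors from normal cells."""
    mean = latents.mean(axis=0)
    # A small ridge keeps the covariance invertible (assumption, not from the paper).
    cov = np.cov(latents, rowvar=False) + 1e-6 * np.eye(latents.shape[1])
    return mean, cov


def negative_log_likelihood(z, mean, cov):
    """Per-sample Gaussian NLL for a 2D array z; larger means less like normal cells."""
    diff = z - mean
    # Mahalanobis term via a linear solve rather than an explicit matrix inverse.
    maha = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    _, logdet = np.linalg.slogdet(cov)
    d = mean.shape[0]
    return 0.5 * (maha + logdet + d * np.log(2.0 * np.pi))


# Synthetic stand-ins for encoder outputs (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
normal_latents = rng.normal(0.0, 1.0, size=(500, 4))
shifted_latents = rng.normal(3.0, 1.0, size=(100, 4))

mean, cov = fit_gaussian(normal_latents)
normal_scores = negative_log_likelihood(normal_latents, mean, cov)
abnormal_scores = negative_log_likelihood(shifted_latents, mean, cov)
```

In this sketch, samples drawn far from the fitted distribution receive a higher score on average, which is the property an abnormality score of this kind relies on.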
2. Materials and Methods
2.1. VAEs and Disentanglement
2.2. Score Function
2.3. Interpreting Latent Space
2.4. Cross-Entropy-Based Referenced Statistical Distance
2.5. Dataset
2.6. Experiments
3. Results
3.1. Latent Space Captures Morphological and Color Characteristics of Cervical Pap-Smear Cells
3.2. Negative Log-Likelihood Is in Line with the Progressive Severity of Precancerous Stages
3.3. Characteristics of External Datasets Revealed by Distributions in Latent Space
3.4. Novel Statistical Pseudo-Distance Improves Clustering Performance
3.5. Toward Reduced Intraobserver Variability Using Retrieved Images
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Relationships among CRSD and Other Divergences
Appendix B. Model Architecture, Training Details, and Hardware
Dataset | Class | Counts (Total) |
---|---|---|
CRIC | Negative for intraepithelial lesion or malignancy (NILM) | 5422 |
CRIC | ASC-US | 563 |
CRIC | LSIL | 1287 |
CRIC | ASC-H | 894 |
CRIC | HSIL | 1609 |
CRIC | SCC | 156 |
SIPAKMED | Superficial–Intermediate | 831 |
SIPAKMED | Parabasal | 782 |
SIPAKMED | Koilocytotic | 814 |
SIPAKMED | Dyskeratotic | 794 |
SIPAKMED | Metaplastic | 785 |
Abnormal Class | Accuracy (±Standard Deviation (std)) | Area under Receiver Operating Characteristic Curve (AUROC) (±std) | F1 Score (±std) | Sensitivity (±std) | Specificity (±std) |
---|---|---|---|---|---|
SCC | 0.806 ± 0.029 | 0.908 ± 0.003 | 0.883 ± 0.020 | 0.800 ± 0.035 | 0.870 ± 0.036 |
HSIL | 0.849 ± 0.002 | 0.920 ± 0.002 | 0.850 ± 0.006 | 0.829 ± 0.027 | 0.871 ± 0.026 |
ASC-H | 0.742 ± 0.005 | 0.808 ± 0.006 | 0.787 ± 0.007 | 0.731 ± 0.018 | 0.761 ± 0.020 |
LSIL | 0.647 ± 0.006 | 0.689 ± 0.005 | 0.684 ± 0.018 | 0.672 ± 0.044 | 0.613 ± 0.043 |
ASC-US | 0.449 ± 0.030 | 0.549 ± 0.013 | 0.492 ± 0.055 | 0.358 ± 0.058 | 0.724 ± 0.055 |
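For readers unfamiliar with the metrics in the table above, the following sketch shows how they can be computed from per-image abnormality scores and binary labels (1 = abnormal, 0 = normal). The threshold, the toy scores, and the function names are illustrative assumptions; the paper's own thresholding procedure is not reproduced here.

```python
import numpy as np


def binary_metrics(scores, labels, threshold):
    """Accuracy, sensitivity, specificity, and F1 with 'abnormal' (1) as the positive class."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pred = (scores > threshold).astype(int)
    tp = int(np.sum((pred == 1) & (labels == 1)))
    tn = int(np.sum((pred == 0) & (labels == 0)))
    fp = int(np.sum((pred == 1) & (labels == 0)))
    fn = int(np.sum((pred == 0) & (labels == 1)))
    # Assumes both classes are present, so the denominators are non-zero.
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def auroc(scores, labels):
    """Probability that a random abnormal sample outscores a random normal one (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


# Toy example (hypothetical scores, not data from the paper).
toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_labels = [0, 0, 1, 0, 1, 1]
metrics = binary_metrics(toy_scores, toy_labels, threshold=3.5)
area = auroc(toy_scores, toy_labels)
```

Unlike the other four metrics, AUROC does not depend on a threshold, which is why it is often reported alongside them when the operating point is a design choice.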
Algorithm | Homogeneity (h) | Completeness (c) | V-Measure (V) |
---|---|---|---|
Agg-SM | 0.393 ± 0.0117 | 0.151 ± 0.00259 | 0.218 ± 0.005 |
Agg-EM | 0.282 ± 0.00591 | 0.0940 ± 0.00174 | 0.141 ± 0.00265 |
K-Means | 0.275 ± 0.006 | 0.0725 ± 0.0019 | 0.114 ± 0.003 |
Spectral clustering | 0.0526 ± 0.0031 | 0.0950 ± 0.0044 | 0.0676 ± 0.0025 |
DBSCAN | 0.000234 ± 0.000413 | 0.746 ± 0.414 | 0.000463 ± 0.000818 |
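The V-measure in the table above is, in its standard definition (Rosenberg and Hirschberg), the weighted harmonic mean of homogeneity and completeness. The short sketch below recomputes it from the reported mean h and c for two rows as a consistency check. The function name is hypothetical and equal weighting (beta = 1) is assumed. Because the table reports means over runs, agreement is expected only approximately.

```python
def v_measure(h, c, beta=1.0):
    """Weighted harmonic mean of homogeneity h and completeness c."""
    if h + c == 0:
        return 0.0
    return (1.0 + beta) * h * c / (beta * h + c)


# Mean homogeneity and completeness copied from the table rows.
rows = {
    "Agg-SM": (0.393, 0.151),
    "Agg-EM": (0.282, 0.0940),
}
recomputed = {name: round(v_measure(h, c), 3) for name, (h, c) in rows.items()}
```

With these inputs the recomputed values round to 0.218 for Agg-SM and 0.141 for Agg-EM, matching the V column.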
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ando, Y.; Cho, J.; Park, N.J.-Y.; Ko, S.; Han, H. Toward Interpretable Cell Image Representation and Abnormality Scoring for Cervical Cancer Screening Using Pap Smears. Bioengineering 2024, 11, 567. https://doi.org/10.3390/bioengineering11060567