# A Strictly Unsupervised Deep Learning Method for HEp-2 Cell Image Classification


## Abstract


## 1. Introduction

## 2. Proposed Clustering Method

#### 2.1. Feature Learning with the Convolutional Autoencoder

Given an input vector **x**, with $x\in {\mathcal{R}}^{D}$, the encoder transforms it into a contracted representation **y**, with $y\in {\mathcal{R}}^{d}$ and $d<D$, by applying a transformation function $g$:

$$y = g\left(Wx + b\right) \qquad (1)$$

where the parameter set $\mathsf{\theta} = \left\{W, b\right\}$ comprises the encoder's weights and biases. After compressing **x** into **y** using Equation (1), the decoder takes the contracted representation **y** as input and applies the same transformation function $g$, this time for the purpose of reconstructing the original signal **x**. Let **z** be the output of the decoder. Then, we have

$$z = g\left({W}^{\prime}y + {b}^{\prime}\right) \qquad (2)$$

where $\mathsf{\theta}^{\prime} = \left\{{W}^{\prime}, {b}^{\prime}\right\}$ englobes all the parameters of the decoder, likewise a set of weights and biases. The network, composed of both the encoder and the decoder, should learn the parameters $\mathsf{\theta}$ (encoder) and $\mathsf{\theta}^{\prime}$ (decoder) such that the reconstructed signal **z** equals the input signal **x**. In other words, the network should learn the parameters that minimize as much as possible the difference between the input **x** and the final network's output **z**:

$$\left(\mathsf{\theta}, \mathsf{\theta}^{\prime}\right) = \underset{\mathsf{\theta}, \mathsf{\theta}^{\prime}}{\mathrm{argmin}}\; L\left(x, z\right) \qquad (3)$$

where **z** denotes the reconstruction (decoder's output), **x** represents the original image (encoder's input), and $L\left(\cdot\right)$ is the cost function that measures the difference between **x** and **z**. In this work, the adopted cost function is the squared Euclidean distance:

$$L\left(x, z\right) = {\left\| x - z \right\|}^{2} \qquad (4)$$

The parameters $\mathsf{\theta}$ and $\mathsf{\theta}^{\prime}$ are learned jointly by minimizing the cost function represented in Equation (4).
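As a concrete illustration of the encode–decode cycle of Equations (1)–(4), the following NumPy sketch runs a single input vector through a dense encoder and decoder and evaluates the squared-Euclidean cost; the dense layers and random parameters are stand-ins for the convolutional layers and the learned $\mathsf{\theta}$, $\mathsf{\theta}^{\prime}$, and the sigmoid is one possible choice of $g$:

```python
import numpy as np

rng = np.random.default_rng(0)

D, d = 16, 4  # input dimension D and latent dimension d, with d < D

# Encoder parameters theta = (W, b); decoder parameters theta' = (W2, b2).
W, b = rng.normal(size=(d, D)), np.zeros(d)
W2, b2 = rng.normal(size=(D, d)), np.zeros(D)

def g(a):
    """Transformation function g (here, a sigmoid nonlinearity)."""
    return 1.0 / (1.0 + np.exp(-a))

def encode(x):
    return g(W @ x + b)           # y = g(Wx + b), Equation (1)

def decode(y):
    return g(W2 @ y + b2)         # z = g(W'y + b'), Equation (2)

def loss(x, z):
    return np.sum((x - z) ** 2)   # squared Euclidean distance, Equation (4)

x = rng.normal(size=D)
z = decode(encode(x))
print(loss(x, z))  # training would adjust the parameters to drive this down
```

Training would then backpropagate the gradient of this loss through both halves of the network to update $\mathsf{\theta}$ and $\mathsf{\theta}^{\prime}$ together.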

#### 2.2. Embedded Clustering Layer for the Convolutional Autoencoder

The matrix **C** is an $M \times k$ matrix, with $M$ being the dimension of the input ${x}_{i}$ and $k$ representing the number of clusters. The matrix **C** contains the clusters' centroids that must be learned; note that the centroids have the same dimension as the input data:

$$\underset{C,\; {s}_{i}}{\min}\; \sum_{i} {\left\| {x}_{i} - C{s}_{i} \right\|}^{2}, \qquad {s}_{i} \in {\left\{0, 1\right\}}^{k} \qquad (6)$$

where ${s}_{i}$ is the one-hot assignment vector of sample $i$. Minimizing Equation (6) can be thought of as solving the following nearest-centroid problem:

$$\underset{j}{\mathrm{argmin}}\; {\left\| {x}_{i} - {c}_{j} \right\|}^{2} \qquad (7)$$

where ${c}_{j}$ denotes the $j$-th column (centroid) of **C**. The input **x** in Equations (6) and (7) is precisely the latent representation (features) learned by the DCAE, as opposed to [34], where the input **x** represents the output of the CNN. This means that, at iteration $t$ of the training process, the clusters' centroids contained in the matrix **C** and the cluster assignments are updated according to the latent representations **x** produced by the DCAE at that same iteration.
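The alternation implied by Equations (6) and (7) — assign each latent feature to its nearest centroid, then recompute each centroid as the mean of its assigned features — can be sketched as follows; the random vectors stand in for the latent features produced by the DCAE at one iteration:

```python
import numpy as np

rng = np.random.default_rng(1)

n, M, k = 200, 8, 5                 # samples, feature dimension, clusters
X = rng.normal(size=(n, M))         # stand-in for DCAE latent features

# M x k centroid matrix C, initialized from k random samples.
C = X[rng.choice(n, k, replace=False)].T

for _ in range(10):
    # Assignment step (Equation (7)): index of the nearest centroid.
    d2 = ((X[:, None, :] - C.T[None, :, :]) ** 2).sum(axis=2)  # n x k
    s = d2.argmin(axis=1)
    # Update step: each centroid becomes the mean of its assigned features.
    for j in range(k):
        if np.any(s == j):
            C[:, j] = X[s == j].mean(axis=0)

# Clustering objective (Equation (6)): squared distances to assigned centroids.
obj = sum(((X[i] - C[:, s[i]]) ** 2).sum() for i in range(n))
```

In the proposed method this objective is not run to convergence in isolation: it is combined with the reconstruction loss, so the features and the centroids evolve together.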

#### 2.3. Reconstruction Consistency

## 3. Results

#### 3.1. Experimental Setup and Dataset

#### 3.2. Results for Case-1

PC$_{1}$ and PC$_{2}$ are, respectively, the first and second axes of the PCA space. The projections of the clusters learned by the case-1 DCAE are shown in Figure 6.
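The two-dimensional PCA projections shown in Figures 6, 8, and 10 (and the explained-variance percentages quoted in their captions) can be reproduced along these lines; the random matrix below stands in for the learned latent features:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 64))      # placeholder for the DCAE latent features

Xc = X - X.mean(axis=0)             # center the features
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs = Xc @ Vt[:2].T                 # PC1 and PC2 coordinates of every sample
explained = S**2 / np.sum(S**2)     # explained-variance ratio per component
print(explained[:2])                # fractions quoted in the figure captions
```

Each sample is then plotted at its (PC$_1$, PC$_2$) coordinates, colored by its cluster assignment.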

#### 3.3. Results for Case-2

#### 3.4. Results for Case-3

## 4. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Rigon, A.; Soda, P.; Zennaro, D.; Iannello, G.; Afeltra, A. Indirect immunofluorescence in autoimmune diseases: Assessment of digital images for diagnostic purpose. Cytometry B Clin. Cytometry **2007**, 72, 472–477.
2. Foggia, P.; Percannella, G.; Soda, P.; Vento, M. Benchmarking HEp-2 cells classification methods. IEEE Trans. Med. Imag. **2013**, 32, 1878–1889.
3. Foggia, P.; Percannella, G.; Saggese, A.; Vento, M. Pattern recognition in stained HEp-2 cells: Where are we now? Pattern Recognit. **2014**, 47, 2305–2314.
4. Cataldo, S.D.; Bottino, A.; Ficarra, E.; Macii, E. Applying textural features to the classification of HEp-2 cell patterns in IIF images. In Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012), Tsukuba, Japan, 11–15 November 2012; pp. 689–694.
5. Wiliem, A.; Wong, Y.; Sanderson, C.; Hobson, P.; Chen, S.; Lovell, B.C. Classification of human epithelial type 2 cell indirect immunofluorescence images via codebook based descriptors. In Proceedings of the 2013 IEEE Workshop on Applications of Computer Vision (WACV), Tampa, FL, USA, 15–17 January 2013; pp. 95–102.
6. Nosaka, R.; Fukui, K. HEp-2 cell classification using rotation invariant co-occurrence among local binary patterns. Pattern Recognit. **2014**, 47, 2428–2436.
7. Huang, Y.C.; Hsieh, T.Y.; Chang, C.Y.; Cheng, W.T.; Lin, Y.C.; Huang, Y.L. HEp-2 cell images classification based on textural and statistic features using self-organizing map. In Proceedings of the 4th Asian Conference on Intelligent Information and Database Systems, Part II, Kaohsiung, Taiwan, 19–21 March 2012; pp. 529–538.
8. Thibault, G.; Angulo, J.; Meyer, F. Advanced statistical matrices for texture characterization: Application to cell classification. IEEE Trans. Biomed. Eng. **2014**, 61, 630–637.
9. Wiliem, A.; Sanderson, C.; Wong, Y.; Hobson, P.; Minchin, R.F.; Lovell, B.C. Automatic classification of human epithelial type 2 cell indirect immunofluorescence images using cell pyramid matching. Pattern Recognit. **2014**, 47, 2315–2324.
10. Xu, X.; Lin, F.; Ng, C.; Leong, K.P. Automated classification for HEp-2 cells based on linear local distance coding framework. J. Image Video Proc. **2015**, 2015, 1–13.
11. Cataldo, S.D.; Bottino, A.; Islam, I.U.; Vieira, T.F.; Ficarra, E. Subclass discriminant analysis of morphological and textural features for HEp-2 staining pattern classification. Pattern Recognit. **2014**, 47, 2389–2399.
12. Bianconi, F.; Fernández, A.; Mancini, A. Assessment of rotation-invariant texture classification through Gabor filters and discrete Fourier transform. In Proceedings of the 20th International Congress on Graphical Engineering (XX INGEGRAF), Valencia, Spain, 4–6 June 2008.
13. Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. **2002**, 24, 971–987.
14. Nosaka, R.; Ohkawa, Y.; Fukui, K. Feature extraction based on co-occurrence of adjacent local binary patterns. In Proceedings of the 5th Pacific Rim Symposium on Advances in Image and Video Technology, Part II, Gwangju, South Korea, 20–23 November 2012; pp. 82–91.
15. Guo, Z.; Zhang, L.; Zhang, D. A completed modeling of local binary pattern operator for texture classification. IEEE Trans. Image Process. **2010**, 19, 1657–1663.
16. Theodorakopoulos, I.; Kastaniotis, D.; Economou, G.; Fotopoulos, S. HEp-2 cells classification via sparse representation of textural features fused into dissimilarity space. Pattern Recognit. **2014**, 47, 2367–2378.
17. Ponomarev, G.V.; Arlazarov, V.L.; Gelfand, M.S.; Kazanov, M.D. ANA HEp-2 cells image classification using number, size, shape and localization of targeted cell regions. Pattern Recognit. **2014**, 47, 2360–2366.
18. Shen, L.; Lin, J.; Wu, S.; Yu, S. HEp-2 image classification using intensity order pooling based features and bag of words. Pattern Recognit. **2014**, 47, 2419–2427.
19. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature **2015**, 521, 436–444.
20. LeCun, Y.; Huang, F.J.; Bottou, L. Learning methods for generic object recognition with invariance to pose and lighting. In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'04), Washington, DC, USA, 27 June–2 July 2004.
21. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the NIPS'12: 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105.
22. Gao, Z.; Wang, L.; Zhou, L.; Zhang, J. HEp-2 cell image classification with deep convolutional neural networks. IEEE J. Biomed. Health Inf. **2017**, 21, 416–428.
23. Li, Y.; Shen, L. A deep residual inception network for HEp-2 cell classification. In Proceedings of the Third International Workshop, DLMIA 2017, and 7th International Workshop, ML-CDS 2017, Québec City, QC, Canada, 14 September 2017.
24. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
25. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
26. Phan, H.T.H.; Kumar, A.; Kim, J.; Feng, D. Transfer learning of a convolutional neural network for HEp-2 cell image classification. In Proceedings of the 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), Prague, Czech Republic, 16 June 2016; pp. 1208–1211.
27. Lei, H.; Han, T.; Zhou, F.; Yu, Z.; Qin, J.; Elazab, A.; Lei, B. A deeply supervised residual network for HEp-2 cell classification via cross-modal transfer learning. Pattern Recognit. **2018**, 79, 290–302.
28. Shen, L.; Jia, X.; Li, Y. Deep cross residual network for HEp-2 cell staining pattern classification. Pattern Recognit. **2018**, 82, 68–78.
29. Bayramoglu, N.; Kannala, J.; Heikkilä, J. Human epithelial type 2 cell classification with convolutional neural networks. In Proceedings of the IEEE 15th International Conference on Bioinformatics and Bioengineering (BIBE), Belgrade, Serbia, 2–4 November 2015; pp. 1–6.
30. Xi, J.; Linlin, S.; Xiande, Z.; Shiqi, Y. Deep convolutional neural network based HEp-2 cell classification. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; pp. 77–80.
31. Vununu, C.; Lee, S.-K.; Kwon, K.-R. A deep feature extraction method for HEp-2 image classification. Electronics **2018**, 8, 20.
32. Yang, B.; Fu, X.; Sidiropoulos, N.D.; Hong, M. Towards k-means-friendly spaces: Simultaneous deep learning and clustering. In Proceedings of the 34th International Conference on Machine Learning (ICML), Sydney, Australia, 6–11 August 2017; pp. 3861–3870. Available online: https://arxiv.org/pdf/1610.04794.pdf (accessed on 9 May 2020).
33. Guo, X.; Liu, X.; Zhou, E.; Yin, J. Deep clustering with convolutional autoencoders. In Proceedings of the International Conference on Neural Information Processing (ICONIP), Guangzhou, China, 14–18 November 2017; pp. 373–382. Available online: https://xifengguo.github.io/papers/ICONIP17-DCEC.pdf (accessed on 9 May 2020).
34. Caron, M.; Bojanowski, P.; Joulin, A.; Douze, M. Deep clustering for unsupervised learning of visual features. In Proceedings of the 15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, 8–14 September 2018. Available online: https://arxiv.org/pdf/1807.05520.pdf (accessed on 9 May 2020).
35. Lloyd, S. Least squares quantization in PCM. IEEE Trans. Info. Theory **1982**, 28, 129–137.
36. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 2015 International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA, 7–9 May 2015.
37. Lovell, B.C.; Percannella, G.; Saggese, A.; Vento, M.; Wiliem, A. International contest on pattern recognition techniques for indirect immunofluorescence images analysis. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico, 4–8 December 2016; pp. 74–76.
38. Bengio, Y. Learning deep architectures for AI. Found. Trends Mach. Learn. **2009**, 2, 1–127.
39. Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science **2006**, 313, 504–507.
40. Xie, J.; Girshick, R.; Farhadi, A. Unsupervised deep embedding for clustering analysis. In Proceedings of the 33rd International Conference on Machine Learning (ICML), New York City, NY, USA, 19–24 June 2016.
41. Yang, J.; Parikh, D.; Batra, D. Joint unsupervised learning of deep representations and image clusters. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June–1 July 2016; pp. 5147–5156. Available online: https://arxiv.org/pdf/1604.03628.pdf (accessed on 9 May 2020).
42. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2015), Munich, Germany, 5–9 October 2015; pp. 234–241.
43. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. **2017**, 39, 2481–2495.
44. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature **1986**, 323, 533–536.
45. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Proceedings of the 14th European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 8–16 October 2016; pp. 630–645.
46. Cai, D.; He, X.; Han, J. Locally consistent concept factorization for document clustering. IEEE Trans. Knowl. Data Eng. **2011**, 23, 902–913.
47. Yeung, K.Y.; Ruzzo, W.L. Details of the adjusted Rand index and clustering algorithms, supplement to the paper "An empirical study on principal component analysis for clustering gene expression data". Bioinformatics **2001**, 17, 763–774.
48. Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. **1933**, 24, 417–441.
49. Nigam, I.; Agrawal, S.; Singh, R.; Vatsa, M. Revisiting HEp-2 cell classification. IEEE Access **2015**, 3, 3102–3113.

**Figure 1.** Illustration of the proposed method. A clustering layer is embedded in the convolutional autoencoder in order to learn how to cluster the latent representations. X is the original cellular image and X′ is the reconstruction.

**Figure 2.** Illustration of the storage technique. The positions of the strongest activations are utilized in order to produce the sparse output.

**Figure 3.** The connections between the pooling layers from the encoder and the unpooling layers from the decoder.
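The storage and unpooling mechanism of Figures 2 and 3 can be illustrated with a small NumPy sketch (a hypothetical 4 × 4 single-channel map; the network applies this per feature map): max pooling records the position of each strongest activation, and the paired unpooling layer places the pooled values back at those positions, with zeros elsewhere, producing the sparse output.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling that also stores the argmax positions (the switches)."""
    h, w = x.shape
    pooled = np.zeros((h // 2, w // 2))
    idx = np.zeros((h // 2, w // 2), dtype=int)  # flat index of each maximum
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            win = x[i:i + 2, j:j + 2]
            k = win.argmax()
            pooled[i // 2, j // 2] = win.ravel()[k]
            idx[i // 2, j // 2] = (i + k // 2) * w + (j + k % 2)
    return pooled, idx

def max_unpool_2x2(pooled, idx, shape):
    """Place each pooled value back at its stored position; zeros elsewhere."""
    out = np.zeros(shape)
    out.ravel()[idx.ravel()] = pooled.ravel()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
p, idx = max_pool_2x2(x)
y = max_unpool_2x2(p, idx, x.shape)  # sparse map: maxima restored in place
```

This is the same switch-passing scheme used by SegNet [43], with each unpooling layer consuming the indices stored by its mirrored pooling layer.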

**Figure 4.** Illustration of the proposed network using the storage and copy/concatenate techniques at the same time. The dimensions shown above each layer refer to the number of feature maps (the depth of the volume). See Table 1 for the spatial dimensions of each volume.

**Figure 5.** Example images from the SNPHEp-2 dataset. (**a**) The positive fluorescence illumination images. (**b**) The negative fluorescence illumination images. In (**a**) and (**b**), from left to right: homogeneous, coarse speckled, fine speckled, nucleolar, and centromere cells.

**Figure 6.** Visualization of the features learned by the DCAE in case-1. “Ho”, “CS”, “FS”, “Nu”, and “Ce” represent homogeneous, coarse speckled, fine speckled, nucleolar, and centromere cells, respectively. The percentages of the variance explained are, respectively, 99.23% and 0.42% for the first and second principal components.

**Figure 7.** Confusion matrix for case-1. “Ho”, “CS”, “FS”, “Nu”, and “Ce” represent homogeneous, coarse speckled, fine speckled, nucleolar, and centromere cells, respectively.

**Figure 8.** Visualization of the features learned by the DCAE in case-2. “Ho”, “CS”, “FS”, “Nu”, and “Ce” represent homogeneous, coarse speckled, fine speckled, nucleolar, and centromere cells, respectively. The percentages of the variance explained are, respectively, 83.06% and 16.38% for the first and second principal components.

**Figure 9.** Confusion matrix for case-2. “Ho”, “CS”, “FS”, “Nu”, and “Ce” represent homogeneous, coarse speckled, fine speckled, nucleolar, and centromere cells, respectively.

**Figure 10.** Visualization of the features learned by the DCAE in case-3. “Ho”, “CS”, “FS”, “Nu”, and “Ce” represent homogeneous, coarse speckled, fine speckled, nucleolar, and centromere cells, respectively. The percentages of the variance explained are, respectively, 56.86% and 33.61% for the first and second principal components.

**Figure 11.** Confusion matrix for case-3. “Ho”, “CS”, “FS”, “Nu”, and “Ce” represent homogeneous, coarse speckled, fine speckled, nucleolar, and centromere cells, respectively.

**Figure 14.** Accuracy of the three networks: (**a**) with different values of the coefficient $\gamma$, and (**b**) with different values of $k$ (number of clusters).

| Layer | Filter Size | #Feature Maps | Stride | Padding | Output |
|---|---|---|---|---|---|
| Input | - | 1 | - | - | 112 × 112 |
| Conv 1 | 3 × 3 | 32 | 1 | 1 | 112 × 112 |
| Pool 1 | 2 × 2 | 32 | 2 | 0 | 56 × 56 |
| Conv 2 | 3 × 3 | 64 | 1 | 1 | 56 × 56 |
| Pool 2 | 2 × 2 | 64 | 2 | 0 | 28 × 28 |
| Conv 3 | 3 × 3 | 128 | 1 | 1 | 28 × 28 |
| Pool 3 | 2 × 2 | 128 | 2 | 0 | 14 × 14 |
| Conv 4 | 3 × 3 | 256 | 1 | 1 | 14 × 14 |
| Pool 4 | 2 × 2 | 256 | 2 | 0 | 7 × 7 |
| Conv 5 | 7 × 7 | 512 | 1 | 0 | 1 × 1 |
| Conv 6 | 7 × 7 | 256 | 1 | 0 | 7 × 7 |
| Unpool 4 | 2 × 2 | 256 | 2 | 0 | 14 × 14 |
| Conv 7 | 3 × 3 | 128 | 1 | 1 | 14 × 14 |
| Unpool 3 | 2 × 2 | 128 | 2 | 0 | 28 × 28 |
| Conv 8 | 3 × 3 | 64 | 1 | 1 | 28 × 28 |
| Unpool 2 | 2 × 2 | 64 | 2 | 0 | 56 × 56 |
| Conv 9 | 3 × 3 | 32 | 1 | 1 | 56 × 56 |
| Unpool 1 | 2 × 2 | 32 | 2 | 0 | 112 × 112 |
| Conv 10 | 3 × 3 | 1 | 1 | 1 | 112 × 112 |
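The spatial sizes in Table 1 follow standard convolution arithmetic, output = ⌊(W + 2P − F)/S⌋ + 1 for input width W, filter size F, stride S, and padding P. A small helper (ours, for illustration) can verify the encoder rows; note that the 1 × 1 output of Conv 5 from its 7 × 7 input is consistent with zero padding:

```python
def out_size(w, f, s, p):
    """Output width of a conv/pool layer: floor((W + 2P - F) / S) + 1."""
    return (w + 2 * p - f) // s + 1

# Encoder rows of Table 1: (input, filter, stride, padding) -> expected output
assert out_size(112, 3, 1, 1) == 112   # Conv 1
assert out_size(112, 2, 2, 0) == 56    # Pool 1
assert out_size(56, 3, 1, 1) == 56     # Conv 2
assert out_size(56, 2, 2, 0) == 28     # Pool 2
assert out_size(28, 2, 2, 0) == 14     # Pool 3
assert out_size(14, 2, 2, 0) == 7      # Pool 4
assert out_size(7, 7, 1, 0) == 1       # Conv 5: padding 0 gives the 1x1 output
```

The decoder mirrors these sizes: each unpooling layer doubles the spatial resolution that its paired pooling layer halved.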

| Method | Description | Accuracy |
|---|---|---|
| Supervised learning, hand-crafted features | Texture features + SVM [49] | 80.90% |
| | DCT features + SIFT + SVM [5] | 82.50% |
| | LBP + SVM [6] | 85.71% |
| Supervised deep learning | Simple CNN [22] | 86.20% |
| | Simple CNN [29] | 88.37% |
| | CNN with Deep Residual Inception Module [23] | 95.61% |
| | CNN using cross-modal transfer learning [27] | 95.99% |
| | CNN with a Deep-Cross Residual Module [28] | 96.26% |
| Unsupervised deep learning | DCAE with an embedded clustering layer (case-1) | 84.64% |
| | DCAE with an embedded clustering layer (case-2) | 93.16% |
| | DCAE with an embedded clustering layer (case-3) | 97.56% |

| Method | Description | Accuracy |
|---|---|---|
| Supervised learning, hand-crafted features | Texture features + SVM [49] | 71.63% |
| | DCT features + SIFT + SVM [5] | 74.91% |
| | LBP + SVM [6] | 79.44% |
| Supervised deep learning | Simple CNN with 5 layers [22] | 97.24% |
| | VGG-like network [29] | 98.26% |
| | CNN with a Deep Residual Inception Module [23] | 98.37% |
| | CNN with cross-modal transfer learning [27] | 98.42% |
| | CNN with a Deep-Cross Residual Module [28] | 98.82% |
| Unsupervised deep learning | DCAE with an embedded clustering layer (case-1) | 89.13% |
| | DCAE with an embedded clustering layer (case-2) | 94.48% |
| | DCAE with an embedded clustering layer (case-3) | 98.51% |
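Scoring the unsupervised rows of these tables requires mapping cluster indices to ground-truth classes, since a clustering is only defined up to a permutation of its labels. A common convention in the deep-clustering literature (e.g., [40]) is to take the best one-to-one mapping found by the Hungarian algorithm; a sketch (the helper name is ours):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one cluster-to-label mapping."""
    k = max(y_true.max(), y_pred.max()) + 1
    count = np.zeros((k, k), dtype=int)      # count[c, t]: cluster c, true t
    for t, c in zip(y_true, y_pred):
        count[c, t] += 1
    rows, cols = linear_sum_assignment(-count)  # maximize matched samples
    return count[rows, cols].sum() / len(y_true)

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])  # clusters permuted but perfectly pure
print(clustering_accuracy(y_true, y_pred))
```

A perfectly pure clustering therefore scores 1.0 regardless of how its cluster indices are numbered.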

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Vununu, C.; Lee, S.-H.; Kwon, K.-R.
A Strictly Unsupervised Deep Learning Method for HEp-2 Cell Image Classification. *Sensors* **2020**, *20*, 2717.
https://doi.org/10.3390/s20092717
