# Unsupervised Clustering of Hyperspectral Paper Data Using t-SNE


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Hyperspectral Acquisition

#### 2.2. Samples and Data

#### 2.3. t-Distributed Stochastic Neighbor Embedding (t-SNE)

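The perplexity is defined as $\mathrm{Perp}(P_i) = 2^{H(P_i)}$, where $H(P_i)$ is the Shannon entropy of $P_i$ measured in bits:

$$H(P_i) = -\sum_{j} p_{j|i} \log_2 p_{j|i}$$

t-SNE searches for the value of $\sigma_i$ such that the effective number of neighbors coincides with the user-provided perplexity [22]. t-SNE uses the Student t-distribution with a single degree of freedom to avoid overcrowding. Using this distribution, the probability in the low-dimensional space, $q_{ij}$, can be defined as shown in the equation below.

$$q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}$$

Here, the counterpart of $x_i$ in the lower dimension is denoted as $y_i$.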
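The perplexity matching and the Student t kernel described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the Gaussian form of $p_{j|i}$ is the standard t-SNE formulation and is assumed here, and the search bounds and tolerance are arbitrary choices.

```python
import math


def conditional_probabilities(sq_dists, sigma):
    """p_{j|i} for one point i, from its squared distances to the other points.

    Assumes the usual Gaussian kernel of t-SNE. The smallest distance is
    subtracted before exponentiating so the sum never underflows to zero.
    """
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """Perp(P_i) = 2 ** H(P_i), with H the Shannon entropy in bits."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


def find_sigma(sq_dists, target_perplexity, tol=1e-5, max_iter=200):
    """Bisection (in log space) for the sigma_i that matches the target perplexity."""
    lo, hi = 1e-10, 1e10  # illustrative bounds, not taken from the paper
    sigma = 1.0
    for _ in range(max_iter):
        sigma = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        perp = perplexity(conditional_probabilities(sq_dists, sigma))
        if abs(perp - target_perplexity) < tol:
            break
        if perp > target_perplexity:
            hi = sigma  # too many effective neighbors: shrink sigma
        else:
            lo = sigma  # too few effective neighbors: grow sigma
    return sigma


def low_dim_affinities(points):
    """q_ij from a Student t kernel with one degree of freedom."""
    n = len(points)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                kernel[i][j] = 1.0 / (1.0 + sq)
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


if __name__ == "__main__":
    # Toy squared distances from one spectrum to four others (made-up values).
    sq_dists = [1.0, 4.0, 9.0, 16.0]
    sigma = find_sigma(sq_dists, target_perplexity=2.0)
    print("sigma:", sigma)
    print("q:", low_dim_affinities([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]))
```

The perplexity grows monotonically with `sigma`, which is why a simple bisection is enough to match the user-provided value. The matrix returned by `low_dim_affinities` is symmetric and sums to one over all pairs.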

#### 2.4. Principal Component Analysis (PCA)

#### 2.5. Clustering Performance Evaluation

#### 2.6. Data Processing

## 3. Results and Discussions

## 4. Conclusions and Future Work

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

**Figure 3.** Paper checker. The left-hand side describes the types of paper, with the corresponding sample on the right-hand side. Identification numbers are marked above each sample.

**Figure 6.** Clustering results of 40 paper samples, with a sample size of 100 spectra. The left-hand plot is obtained using Principal Component Analysis (PCA), and the right-hand plot is obtained using t-Distributed Stochastic Neighbor Embedding (t-SNE).

Validation Index | PCA | t-SNE |
---|---|---|
NMI | 0.72 | 0.92 |
HI | 0.70 | 0.92 |
CI | 0.75 | 0.92 |
SI | 0.34 | 0.44 |
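As a rough illustration of how three of the indices in the table above can be computed from cluster labels, the sketch below uses only the Python standard library. It assumes that HI and CI denote the homogeneity and completeness scores and that NMI is normalized by the arithmetic mean of the two label entropies; the source text does not state either point, and all function names are hypothetical. The silhouette index (SI) is omitted because it needs pairwise distances rather than labels.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(a, b):
    """H(A | B) for two label sequences of equal length."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum((c / n) * math.log(c / b_counts[bv])
                for (_, bv), c in joint.items())


def homogeneity(truth, pred):
    """1 - H(truth | pred) / H(truth): each cluster holds a single class."""
    h = entropy(truth)
    return 1.0 if h == 0 else 1.0 - conditional_entropy(truth, pred) / h


def completeness(truth, pred):
    """1 - H(pred | truth) / H(pred): each class falls in a single cluster."""
    return homogeneity(pred, truth)


def nmi(truth, pred):
    """Mutual information divided by the mean of the two entropies (assumed)."""
    h_t, h_p = entropy(truth), entropy(pred)
    if h_t == 0 and h_p == 0:
        return 1.0
    mutual_info = h_t - conditional_entropy(truth, pred)
    return 2.0 * mutual_info / (h_t + h_p)


if __name__ == "__main__":
    # Made-up labels: two paper types, clustering that over-splits them.
    truth = [0, 0, 1, 1]
    pred = [0, 1, 2, 3]
    print(homogeneity(truth, pred), completeness(truth, pred), nmi(truth, pred))
```

Under this normalization, NMI equals the harmonic mean of homogeneity and completeness. The tabulated values appear consistent with that reading: for PCA, 0.70 and 0.75 give roughly 0.72.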

Sample Count | Optimal Perplexity | NMI (PCA) | NMI (t-SNE) | HI (PCA) | HI (t-SNE) | CI (PCA) | CI (t-SNE) | SI (PCA) | SI (t-SNE) |
---|---|---|---|---|---|---|---|---|---|
25 | 25 | 0.78 | 0.94 | 0.75 | 0.94 | 0.81 | 0.94 | 0.39 | 0.51 |
64 | 100 | 0.76 | 0.90 | 0.73 | 0.90 | 0.79 | 0.91 | 0.38 | 0.48 |
100 | 100 | 0.74 | 0.92 | 0.71 | 0.92 | 0.76 | 0.92 | 0.37 | 0.46 |
225 | 300 | 0.73 | 0.91 | 0.71 | 0.90 | 0.75 | 0.91 | 0.34 | 0.46 |
625 | 300 | 0.70 | 0.93 | 0.67 | 0.93 | 0.72 | 0.94 | 0.31 | 0.42 |
900 | 600 | 0.70 | 0.92 | 0.68 | 0.92 | 0.73 | 0.92 | 0.33 | 0.43 |
1600 | 600 | 0.69 | 0.92 | 0.67 | 0.92 | 0.71 | 0.92 | 0.29 | 0.41 |
2500 | 1000 | 0.69 | 0.92 | 0.67 | 0.92 | 0.72 | 0.92 | 0.31 | 0.39 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

