# Spectral Clustering of CRISM Datasets in Jezero Crater Using UMAP and k-Means

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Data and Location

**Figure 1.** Location map of the cubes used in the present work. (**A**) HRSC MC-13 quadrangle color basemap [44] of Nili Fossae and surrounding areas, with Jezero in the highlighted subset. (**B**) Jezero Crater CTX mosaic [45,46] with CRISM observation HRL000040FF indicated; the overlapping CRISM MTRDR data covering the delta are highlighted in white. (**C**) IR enhanced color composite (FAL) of CRISM observation HRL000040FF, using R2529, R1506, and R1080 as RGB [26].

#### 2.2. Dimensionality Reduction

#### 2.3. Data Pipeline

#### 2.4. Quantitative Metrics

## 3. Results

#### 3.1. Quantitative Analysis

#### 3.2. Qualitative Analysis

## 4. Quantitative Geological Mapping

Algorithm 1: Quantitative geological mapping

## 5. Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
CRISM | Compact Reconnaissance Imaging Spectrometer for Mars |
DB | Davies-Bouldin |
GMM | Gaussian Mixture Model |
MRO | Mars Reconnaissance Orbiter |
MTRDR | Map-projected Targeted Reduced Data Record |
PCA | Principal Component Analysis |
SC | Silhouette Coefficient |
SCM | Spectral Cluster Map |
t-SNE | t-distributed Stochastic Neighbor Embedding |
UMAP | Uniform Manifold Approximation and Projection |

## Appendix A

#### Appendix A.1

**Table A1.** Mean of the Calinski-Harabasz and Davies-Bouldin criteria over a range of 5 to 20 clusters for the FRT0000c564 dataset, split by method. The best score for each coefficient is in bold.

Methods | Calinski-Harabasz | Davies-Bouldin |
---|---|---|
UMAP + k-Means | **110,568** | **0.7939** |
UMAP + GMM | 89,283 | 0.8535 |
Autoencoder + k-Means | 28,345 | 1.2651 |
Autoencoder + GMM | 14,133 | 2.0985 |
PCA + k-Means | 40,415 | 1.2055 |
PCA + GMM | 13,995 | 2.8828 |
t-SNE + k-Means | 94,857 | 0.8092 |
t-SNE + GMM | 88,134 | 0.8222 |

**Table A2.** Mean of the Calinski-Harabasz and Davies-Bouldin criteria over a range of 5 to 20 clusters for the FRT0000b776 dataset, split by method. The best score for each coefficient is in bold.

Methods | Calinski-Harabasz | Davies-Bouldin |
---|---|---|
UMAP + k-Means | **216,545** | **0.7192** |
UMAP + GMM | 196,229 | 0.7545 |
Autoencoder + k-Means | 39,897 | 1.1298 |
Autoencoder + GMM | 16,721 | 2.1699 |
PCA + k-Means | 59,637 | 1.3214 |
PCA + GMM | 21,027 | 4.3664 |
t-SNE + k-Means | 152,121 | 0.8279 |
t-SNE + GMM | 144,334 | 0.8501 |

**Table A3.** Mean of the Calinski-Harabasz and Davies-Bouldin criteria over a range of 5 to 20 clusters for the FRT0001c71b dataset, split by method. The best score for each coefficient is in bold.

Methods | Calinski-Harabasz | Davies-Bouldin |
---|---|---|
UMAP + k-Means | **164,883** | **0.6932** |
UMAP + GMM | 139,372 | 0.7362 |
Autoencoder + k-Means | 31,184 | 1.0316 |
Autoencoder + GMM | 17,879 | 1.7953 |
PCA + k-Means | 56,562 | 0.9992 |
PCA + GMM | 25,703 | 1.6261 |
t-SNE + k-Means | 92,332 | 0.8048 |
t-SNE + GMM | 86,278 | 0.8155 |

## Appendix B. Citation of PDS Data Products

HRL000040FF |

FRT0000c564 |

FRT0000b776 |

FRT0001c71b |

**Figure 2.** Silhouette score of UMAP + k-Means as a function of the number of clusters for the HRL000040FF dataset.
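To make the quantity plotted in Figure 2 concrete, the following is a minimal, dependency-free sketch of the mean silhouette coefficient and of picking the cluster count with the highest score. It is not the authors' implementation: the function names `silhouette_score` and `best_cluster_count`, the Euclidean distance, and the `labelings` dictionary (cluster count mapped to one label per pixel) are assumptions made for illustration.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own if j != i)
        a /= len(own) - 1
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:  # identical points contribute 0
            total += (b - a) / max(a, b)
    return total / len(points)


def best_cluster_count(points, labelings):
    """Return the cluster count whose labeling has the highest silhouette.

    `labelings` is a hypothetical dict {k: labels} produced by any
    clustering step (for example k-Means on an embedding).
    """
    scores = {k: silhouette_score(points, labs) for k, labs in labelings.items()}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
    k, scores = best_cluster_count(pts, {2: [0, 0, 1, 1], 3: [0, 0, 1, 2]})
    print(k, scores)
```

The pairwise loop is quadratic in the number of points, so for a full CRISM cube a library routine or a subsample would be the practical choice; the sketch only shows what the curve measures.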

**Figure 4.** Left: the expert map used by Gao et al. [11], with each class shown in a different color. The 6 classes are: olivine, yellow; pyroxene, orange; carbonate, green; Fe/Mg smectite, blue; silica, magenta; and unclassified area, gray. Right: the spectral cluster map generated by UMAP + k-Means with 6 identified clusters, showing the same detail as the expert map (**a**).

**Figure 5.** Spectral cluster map generated by UMAP + k-Means with 6 clusters for the complete clustered area of Jezero Crater.

**Figure 6.** Mean spectrum per cluster as a representative fingerprint. Key diagnostic absorptions at 1900 nm (water in minerals), 2300 nm and 2500 nm (carbonate), and 2300 nm (Fe/Mg smectite) are marked with vertical dotted lines.
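The per-cluster fingerprints in Figure 6 amount to averaging all pixel spectra that share a cluster label. A minimal sketch of that averaging is given below, together with a helper that finds the band closest to a marked wavelength such as 1900 nm. The names `mean_spectra_per_cluster` and `nearest_band`, and the plain-list data layout, are assumptions for illustration and are not taken from the paper.

```python
def mean_spectra_per_cluster(spectra, labels):
    """Average the spectra of all pixels that share a cluster label.

    spectra: list of equal-length reflectance lists, one per pixel
    labels:  list of cluster labels, one per pixel
    Returns {label: mean spectrum as a list of floats}.
    """
    if len(spectra) != len(labels):
        raise ValueError("spectra and labels must have the same length")
    sums, counts = {}, {}
    for spectrum, lab in zip(spectra, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(spectrum)
            counts[lab] = 0
        for band, value in enumerate(spectrum):
            sums[lab][band] += value
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def nearest_band(wavelengths_nm, target_nm):
    """Index of the band whose center wavelength is closest to target_nm."""
    return min(range(len(wavelengths_nm)),
               key=lambda i: abs(wavelengths_nm[i] - target_nm))


if __name__ == "__main__":
    # Toy data: three pixels, two bands (values are made up).
    spectra = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
    labels = [0, 0, 1]
    print(mean_spectra_per_cluster(spectra, labels))
    print(nearest_band([1000, 1900, 2300, 2500], 2290))
```

Reading the mean spectrum at the index returned by `nearest_band` for 1900, 2300, and 2500 nm gives the values at the dotted lines described in the caption.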

**Figure 7.** The MAF browse product of the MTRDR observation HRL000040FF. This browse product shows information related to mafic mineralogy and displays olivine and Fe-phyllosilicate in red [26].

**Figure 8.** The two novel classes identified by UMAP + k-Means, shown as an overlay on the true-color image of Jezero Crater. Left: cluster 2 mainly covers the unclassified area of the expert map (cf. Figure 4a). Right: cluster 6 indicates a new mineralogy class.

**Table 1.** Mean of the Calinski-Harabasz and Davies-Bouldin criteria over a range of 5 to 20 clusters for the HRL000040FF dataset, split by method. The best score for each coefficient is in bold.

Methods | Calinski-Harabasz | Davies-Bouldin |
---|---|---|
UMAP + k-Means | **114,928** | 0.8179 |
UMAP + GMM | 109,469 | 0.8349 |
Autoencoder + k-Means | 25,649 | 1.2435 |
Autoencoder + GMM | 12,648 | 2.5199 |
PCA + k-Means | 53,478 | 1.0643 |
PCA + GMM | 19,594 | 2.6050 |
t-SNE + k-Means | 78,578 | **0.8072** |
t-SNE + GMM | 75,248 | 0.8120 |
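For readers unfamiliar with the two criteria, the sketch below spells out their standard definitions in plain Python and averages them over several labelings, as the table does over 5 to 20 clusters. Higher Calinski-Harabasz and lower Davies-Bouldin values indicate better-separated clusters. The function names, the Euclidean distance, and the `labelings` argument are assumptions for illustration; in practice scikit-learn's `calinski_harabasz_score` and `davies_bouldin_score` would likely be used instead.

```python
import math


def _centroid(rows):
    """Component-wise mean of a list of points."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def _group(points, labels):
    """Collect points by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def calinski_harabasz(points, labels):
    """Ratio of between-cluster to within-cluster dispersion (higher is better)."""
    clusters = _group(points, labels)
    n, k = len(points), len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")
    overall = _centroid(points)
    between = within = 0.0
    for members in clusters.values():
        c = _centroid(members)
        between += len(members) * math.dist(c, overall) ** 2
        within += sum(math.dist(p, c) ** 2 for p in members)
    if within == 0:
        return math.inf  # every point sits exactly on its centroid
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(points, labels):
    """Mean worst-case cluster similarity (lower is better).

    Assumes no two clusters share the same centroid.
    """
    clusters = _group(points, labels)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    cent = {lab: _centroid(m) for lab, m in clusters.items()}
    # Scatter: mean distance of a cluster's points to its centroid
    scat = {lab: sum(math.dist(p, cent[lab]) for p in m) / len(m)
            for lab, m in clusters.items()}
    worst = [max((scat[i] + scat[j]) / math.dist(cent[i], cent[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


def mean_scores(points, labelings):
    """Average both criteria over several labelings (e.g. one per cluster count)."""
    ch = [calinski_harabasz(points, labs) for labs in labelings]
    db = [davies_bouldin(points, labs) for labs in labelings]
    return sum(ch) / len(ch), sum(db) / len(db)


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
    print(mean_scores(pts, [[0, 0, 1, 1]]))
```

Here `points` would be the low-dimensional embedding of the pixel spectra and each entry of `labelings` one clustering result; both are placeholders for whatever the reduction and clustering steps produce.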

Cluster | Selected Products | Geology | Expert Map |
---|---|---|---|
1 | HCPINDEX2, CINDEX2, BD1750_2 | Carbonates | Carbonates |
2 | BD1750_2, HCPINDEX2 | Gypsum, Alunite | unclassified |
3 | CINDEX2, RPEAK1 | Fe, Fe-Carbonate | Fe |
4 | D2300, HCPINDEX2 | Pyroxene, Silicates | Pyroxene |
5 | HCPINDEX2, RPEAK1 | Fe-mineralogy (suggest Olivine) | Olivine |
6 | BD1750_2, OLINDEX3, HCPINDEX2 | Gypsum, Alunite, Olivine | unclassified |

