Generation and Optimization of Spectral Cluster Maps to Enable Data Fusion of CaSSIS and CRISM Datasets

Abstract: Four-band color imaging of the Martian surface using the Color and Stereo Surface Imaging System (CaSSIS) onboard the European Space Agency's ExoMars Trace Gas Orbiter exhibits a high color diversity in specific regions. Not only is the correlation of color diversity maps with local morphological properties desirable, but mineralogical interpretation of the observations is also of great interest. The relatively high spatial resolution of CaSSIS data mitigates its low spectral resolution. In this paper, we combine the broad-band imaging of the surface of Mars acquired by CaSSIS with hyperspectral data from the Compact Reconnaissance Imaging Spectrometer (CRISM) onboard NASA's Mars Reconnaissance Orbiter to achieve a fusion of both datasets. We achieve this using dimensionality reduction and data clustering of the high-dimensional datasets from CRISM. In the presented research, CRISM data from the Coprates Chasma region of Mars are processed with different machine learning methods, which are compared for robustness. With the help of a suitable metric, the best method is selected and, in a further step, an optimal cluster number is determined. To validate the methods, the so-called "summary products" derived from the hyperspectral data are used to correlate each cluster with its mineralogical properties. We restrict the analysis to the visible range in order to match the generated clusters to the CaSSIS band information in the range of 436–1100 nm. In the machine learning community, the so-called UMAP method for dimensionality reduction has recently gained attention because of its speed compared to the already established t-SNE. The results of this analysis also show that this method in combination with the simple K-Means outperforms comparable methods in its efficiency and speed. The optimal number of clusters obtained lies between three and six. Correlating the spectral cluster maps with the given summary products from CRISM shows that four bands, and especially the NIR bands and VIS albedo, are sufficient to discriminate most of these clusters. This demonstrates that, with semi-automatic processing, features in the four-band CaSSIS images can provide robust mineralogical information despite the limited spectral information.


Introduction
One of the major difficulties in the investigation of our solar system is that high resolution datasets returned by in situ orbiting spacecraft are usually incomplete, either spatially, spectrally or both. Observations of the surface of Mars have shown that high-resolution remote sensing is needed to establish the physico-chemical properties of specific areas. Only then can interpretation of the processes involved follow and other aspects, such as the suitability for a future landing, be considered. High resolution, however, implies high data volume with reduced surface coverage. Instruments such as the High Resolution Imaging Science Experiment (HiRISE) [1] and Compact Reconnaissance Imaging Spectrometer (CRISM) [2] onboard NASA's Mars Reconnaissance Orbiter (MRO) and CaSSIS [3] onboard the European Space Agency's (ESA) ExoMars Trace Gas Orbiter (TGO) are good illustrations of the problem. All three instruments have high resolution, yet their data in total amount to less than 5% surface coverage of the planet. Nonetheless, the respective datasets are large, suggesting that automated processing techniques can produce significant benefits. This has led us to pose the question of whether imaging spectrometer datasets from CRISM can be linked to broad-band imaging datasets from CaSSIS to improve interpretation through both spatial interpolation of the spectra and extrapolation by taking advantage of redundancy in the spectral domain.
Spectral data are complex because of resolution issues and the high number of bands and subclasses. Therefore, our paper relies on unsupervised classification, which is an important standard procedure in geospatial analysis [4], especially when analyzing hyperspectral data with insufficient in-field calibration data. For analysis purposes, the combination of band information and spatial distributions is formed into a data structure, in this paper called Spectral Cluster Maps (SCMs). High-dimensional data are transferred to a low-dimensional latent variable representation by directly applying advanced methods on the full spectrum itself, and the resulting clusters can be related to the underlying geochemical composition (compare Gao et al. [5]). Therefore, it is essential to find suitable unsupervised dimensionality reduction techniques to produce accurate SCMs before applying various clustering algorithms on the feature space. Principal component analysis (PCA) [6] is the most commonly used technique applied to spectral data (e.g., [7,8]) and we use it here as a benchmark against more elaborate algorithms. In recent machine learning studies, approaches such as t-SNE [9] have achieved promising results. Distinct grouping has been obtained by focusing on more local structures and mapping the feature space into a low-dimensional representation. Compare, among others, the works of Pouyet et al. and Song et al. [10,11] or the self-organizing maps technique (SOM) developed by Kohonen [12], which has already been proposed for generating spectral databases. Specifically for Mars, a recent study published by Gao et al. [5] proposes the autoencoder technique for spectral application. The application of the UMAP technique to spectral data is relatively rare at present. Groups tackling this issue include Picollo et al. [13] and Wander et al. [14]. The most relevant work in this context is the recently published paper by D'Amore and Padovan [15] using UMAP for mapping reflectance spectra from Mercury. Publications using UMAP are more abundant in the biology research field [16,17]; however, due to its speed and robustness, its wider application in planetary science is desirable.
Generating spectral cluster maps from CRISM and CaSSIS data has a direct impact on geologic mapping activities (e.g., [18]). The planetary geologic mapping process itself relies on basic geometric and stratigraphic principles, historically limited by the availability of image and topographic data. The availability of compositional data in recent decades has allowed the inclusion of different kinds of methods, varying from heuristic methods to statistical approaches [19][20][21][22].
To assess the feasibility and efficacy of our approach, we have selected Coprates Chasma, as the region exhibits significant mineralogical (color) diversity in CRISM and CaSSIS observations. The rest of this paper is structured as follows: Section 2 describes the data and the preprocessing used. The examined dimensionality reduction techniques and clustering algorithms used in this study are also briefly presented. In Section 3, the obtained results are illustrated and discussed. Section 4 proceeds with a geological mapping based on the image products and links to the CaSSIS data. The paper finishes with a brief conclusion (Section 5).

Materials and Methods
This section is devoted to the machine learning techniques considered in this study. The data and their origin are also described.

Data Source
CRISM is a high spectral resolution visible and infrared mapping spectrometer currently in orbit around Mars onboard NASA's Mars Reconnaissance Orbiter (MRO) [2]. For this analysis, hyperspectral datasets (compare Appendix B for a list) provided by Johns Hopkins University through the Planetary Data System hosted at Washington University in St. Louis [23] were selected.
CRISM provides 2D spatially resolved spectra over a wavelength range of 362 nm to 3920 nm at 6.55 nm/channel. The spatial resolution is typically around 18 m/px. Pelkey et al. [24] and Viviano et al. [25] generated a feature set of "image products" from CRISM spectra, which are strongly related to the geochemical composition of the Martian surface.
The CaSSIS instrument is a high spatial resolution color and stereo imager [3] currently in orbit around Mars onboard the European Space Agency's ExoMars Trace Gas Orbiter (TGO). CaSSIS returns images at 4.5 m/px from the nominal 400 km altitude orbit in four colors using a push-frame technique. The images typically sample an area of approximately 9 km × 40 km on the Martian surface, with around 24 images being acquired per day. The filters were selected to provide good mineral diagnostics in the visible wavelength range (400-1100 nm) and to complement the filters in the extremely high-resolution HiRISE system onboard NASA's Mars Reconnaissance Orbiter (MRO).
Studies by Tornabene et al. [26] illustrated the potential for mineralogical diagnostics using preflight calibration data. Good performance in flight has also been established. Parkes Bowen et al. [27,28] have demonstrated the effectiveness of CaSSIS through identification of two spectrally and morphologically distinct subunits of the Oxia Planum (the ExoMars rover landing site) clay unit: one indicative of Fe/Mg-rich clay minerals and one showing decametre-scale fracturing with Fe/Mg-rich clay mineral/olivine signatures.
The data are radiometrically and geometrically well-calibrated in absolute units (i.e., "I/F" as commonly used in the planetary community) [29,30].

Data and Location
Coprates Chasma, a central part of the Valles Marineris tectonic system on Mars, was selected for initial studies. Coprates Chasma is a 1000 km long, 100 km wide linear trough connecting Melas Chasma (central Valles Marineris) to Capri Chasma (eastern Valles Marineris). The CRISM data files used for this study were the MTRDR products FRT0000d3a4, FRT0001c479 and FRT0001c71b (compare Figure 1). The area exhibits significant color diversity at visible wavelengths and is of major interest in studies of the history of liquid water on Mars. Geologic features, including fluvial topography (e.g., [31,32]) and the distribution of a variety of aqueous minerals including sulfates, are evidence of the influence of water in Valles Marineris [33].
Weitz and Bishop [34] investigated the morphology, mineralogy and stratigraphy of light-toned layered deposits in the same region and found numerous hydrated minerals, including Al-phyllosilicates, Fe/Mg-phyllosilicates, hydrated silica, hydrated sulfates, jarosite and acid alteration products based on visible to near-IR spectral analysis (e.g., [35]). They suggest that valleys sourced from water along the plateau may have flowed downward into one or more troughs with changing aqueous chemistry resulting in the diverse mineralogies.
Fueten et al. [36] investigated layered deposits in northern Coprates Chasma. Here, hydrated sulfates have been detected, indicating alteration or deposition by liquid water. There is also evidence of pedogenesis (weathering of basaltic soils by continuous exposure to water percolating down from the surface), which can result in layers of aluminium phyllosilicates forming over layers of iron-magnesium phyllosilicates [37] on the plateau around Coprates Chasma [38]. The primary test area is centred on an exposure of lighter-toned material within a 25 km diameter crater. It was expected to give clear, mineralogically diverse signals in CRISM and CaSSIS data.

Preprocessing
Our preprocessing steps are similar to the steps in Gao et al. [5]: we select a subimage with a size of 400 × 400 pixels from the image area in order to exclude unwanted empty areas from the calculation. Like Gao et al. [5], we also perform a per-pixel normalization by removing values outside 0 and 1. To cover the range of the CaSSIS instrument, we restrict data to a wavelength range of 436-1106 nm and adjust the preprocessing accordingly. This range corresponds to 88 channels of the CRISM hyperspectral dataset. In summary, we have (400 × 400) × 88 vectorized images.
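As a rough illustration of these preprocessing steps, the following sketch restricts a toy cube to the CaSSIS wavelength range and flattens it into a list of spectra. It is written in plain Python for self-containment; the function names are hypothetical, and the rule of dropping a whole pixel when any retained value lies outside [0, 1] is an assumption about what "removing values outside 0 and 1" means, not a confirmed detail of the pipeline.

```python
def select_channels(wavelengths_nm, lo_nm=436.0, hi_nm=1106.0):
    """Return indices of channels whose centre wavelength lies in [lo_nm, hi_nm]."""
    return [i for i, w in enumerate(wavelengths_nm) if lo_nm <= w <= hi_nm]


def preprocess(cube, wavelengths_nm):
    """Flatten a rows x cols x channels cube into a list of valid spectra.

    Assumption: a pixel is dropped when any retained channel lies outside
    [0, 1]; the original pipeline may instead mask or clip such values.
    """
    keep = select_channels(wavelengths_nm)
    spectra = []
    for row in cube:
        for pixel in row:
            spectrum = [pixel[i] for i in keep]
            if all(0.0 <= v <= 1.0 for v in spectrum):
                spectra.append(spectrum)
    return spectra


# Toy 2 x 2 cube with 4 channels (the cubes in the text are 400 x 400 x 88).
wavelengths = [400.0, 436.0, 770.0, 1200.0]
cube = [
    [[0.1, 0.2, 0.3, 0.4], [0.1, 1.5, 0.3, 0.4]],
    [[0.2, 0.2, 0.2, 0.2], [0.0, 0.0, -0.1, 0.0]],
]
spectra = preprocess(cube, wavelengths)
```

In the toy example, only the two pixels whose retained channels lie inside [0, 1] survive, each reduced to the two channels within 436-1106 nm.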

Dimensionality Reduction Techniques
We compared several techniques for dimensionality reduction and feature extraction. Dimensionality reduction is needed to reduce the high data volume into a feature space with lower dimension while keeping the relevant information.
We introduce each technique briefly.

Autoencoder
Generally speaking, an autoencoder is an unsupervised feature extraction procedure based on a neural network. It consists of three main components: an encoder network, a latent feature representation and a decoder network. The concept of the encoder is to re-compile the data such that the main information of the input is represented by a certain number of latent variables. The dimensionality of the reduced feature space is a user-chosen positive number.
The aim of the decoder is to rescale the encoder output to the initial shape of the data, as described by Kovenko et al. [39]. The model is trained by using back-propagation. More information on this topic can be found in [40].
To measure the accuracy during the training process, a loss function is employed, which has to be minimized. For an autoencoder, it is common practice to use the well-known mean squared error or mean absolute error to evaluate performance. In this paper, we adopt the approach of Gao et al. [5] and use the spectral angle (SA) as a loss function. This is denoted by:
$$\mathrm{SA}(x, \hat{x}) = \arccos\left(\frac{x \cdot \hat{x}}{\lVert x \rVert \, \lVert \hat{x} \rVert}\right)$$
where $x$ is the input data and $\hat{x}$ is the reconstructed dataset. As pointed out by Gao et al. [5], this maintains the capability of capturing small features in the spectra.
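The spectral angle between an input spectrum and its reconstruction can be sketched in a few lines of plain Python. This only illustrates the quantity being minimized; it is not the training code, and the function name is hypothetical.

```python
import math


def spectral_angle(x, x_hat):
    """Angle in radians between a spectrum x and its reconstruction x_hat."""
    dot = sum(a * b for a, b in zip(x, x_hat))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_x_hat = math.sqrt(sum(b * b for b in x_hat))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_x_hat)))
    return math.acos(cosine)


angle_scaled = spectral_angle([0.1, 0.2, 0.3], [0.2, 0.4, 0.6])  # same shape, brighter
angle_orthogonal = spectral_angle([1.0, 0.0], [0.0, 1.0])
```

Because the angle ignores the overall scale of a spectrum, a uniformly brighter copy of the same spectrum yields an angle of (numerically) zero, whereas a change in spectral shape does not.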

t-SNE
The t-distributed Stochastic Neighbor Embedding (t-SNE) technique, introduced by van der Maaten and Hinton in 2008 [41], is a pioneering approach for reducing the dimensionality of multidimensional data. Because of its remarkable ability to map high-dimensional data to lower dimensions, its acceptance and adoption are rising in the machine learning community [9]. The idea is to express the similarities between two points $x_i$ and $x_j$ as conditional probabilities $p_{j|i}$ by converting the Euclidean distances:
$$p_{j|i} = \frac{\exp\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)}$$
where $\sigma_i^2$ is the variance of the Gaussian distribution that is centered on data point $x_i$.
For the lower-dimensional representation, a similar conditional probability $q_{j|i}$ is likewise calculated for $y_i$ and $y_j$, the low-dimensional counterparts of the high-dimensional data points $x_i$ and $x_j$:
$$q_{j|i} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}{\sum_{k \neq i} \left(1 + \lVert y_i - y_k \rVert^2\right)^{-1}}$$
In order to avoid overcrowding, a Student t-distribution with one degree of freedom is used here to model the probabilities. The projections $y_i$ and $y_j$ have to be mapped in such a way that they correctly rebuild the similarities between the high-dimensional data points, which ideally means that the conditional probabilities $p_{j|i}$ and $q_{j|i}$ are equal.
Similar to the autoencoder, an iterative algorithm is used to minimize a cost function, here the Kullback-Leibler divergence [42]. An input parameter to the t-SNE algorithm is the perplexity, which can be construed as a smooth measure of the effective number of neighbors.
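To make the high-dimensional similarities concrete, the sketch below computes the conditional probabilities for three toy points. It assumes a single shared bandwidth `sigma` for all points, which is a simplification: in t-SNE, each point's bandwidth is tuned so that its neighborhood matches the chosen perplexity. The function name is hypothetical.

```python
import math


def conditional_probabilities(points, sigma):
    """Return P with P[i][j] = p_{j|i}, using one shared bandwidth sigma.

    Simplification: real t-SNE searches a separate sigma_i per point to
    match the requested perplexity.
    """
    n = len(points)
    P = []
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(0.0)  # a point is not its own neighbor
            else:
                sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights.append(math.exp(-sq / (2.0 * sigma ** 2)))
        total = sum(weights)
        P.append([w / total for w in weights])
    return P


points = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
P = conditional_probabilities(points, sigma=1.0)
```

Each row of `P` sums to one, and the nearby point receives almost all of the probability mass of the first point, which is the local-neighborhood emphasis described above.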

UMAP
In 2018, McInnes and Healy [43] presented the Uniform Manifold Approximation and Projection (UMAP) as a method for dimensionality reduction and data visualization. The idea and computation resemble those of t-SNE to a large extent. A concise overview of the algorithm is given by Allaoui et al. [44]. UMAP aims to represent the dataset in a fuzzy topological structure. In order to build such a structure, the data points are represented in a high-dimensional weighted graph. Each edge weight depicts the probability that two points are connected and is defined by:
$$p_{i|j} = \exp\left(-\frac{d(x_i, x_j) - \rho_i}{\sigma_i}\right)$$
where $d(x_i, x_j)$ depicts the distance between the $i$-th and $j$-th data points, $\rho_i$ is the distance between the $i$-th data point and its first nearest neighbor, and $\sigma_i$ is a per-point scaling factor. Analogous to t-SNE, a lower-dimensional representation has to be determined which properly reproduces the relations of the data points in the high-dimensional graph. To model these low-dimensional similarities, UMAP uses a distribution similar to the Student t-distribution:
$$q_{ij} = \left(1 + a \lVert y_i - y_j \rVert^{2b}\right)^{-1}$$
In the default UMAP implementation, $a \approx 1.93$ and $b \approx 0.79$ are used, but setting $a = 1$ and $b = 1$ results in the Student t-distribution applied in t-SNE [43].
To optimize the low-dimensional representation, UMAP uses binary cross-entropy as a cost function. It is also necessary to specify the number of nearest neighbors. As outlined by Vermeulen et al. [45], this parameter controls how UMAP handles local versus global structure in the data. A small value makes UMAP concentrate on very local structure, while a larger value causes UMAP to consider larger neighborhoods.
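The relation between the UMAP low-dimensional similarity and the Student t kernel of t-SNE can be checked numerically. The sketch below evaluates the unnormalized kernel only; the default values of `a` and `b` are simply the ones quoted in the text, and the function names are hypothetical.

```python
def low_dim_similarity(dist, a=1.93, b=0.79):
    """UMAP-style similarity of two embedded points at distance dist."""
    return 1.0 / (1.0 + a * dist ** (2.0 * b))


def student_t_similarity(dist):
    """Unnormalized Student t kernel (one degree of freedom) used in t-SNE."""
    return 1.0 / (1.0 + dist ** 2)


distances = [0.0, 0.5, 1.0, 2.0]
umap_like = [low_dim_similarity(d) for d in distances]
reduced = [low_dim_similarity(d, a=1.0, b=1.0) for d in distances]
student = [student_t_similarity(d) for d in distances]
```

With `a = 1` and `b = 1`, `reduced` and `student` coincide, while the quoted defaults give a kernel that falls off faster at short distances.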
For benchmarking the proposed techniques, we implement the standard statistical principal component analysis (PCA) in our data pipeline.

Clustering Algorithms
In our analysis, we use well-known and established procedures for clustering the data. The K-Means clustering algorithm was published in 1967 by MacQueen [46]. Starting by initializing a set of $k$ cluster centers, K-Means aims to minimize the Euclidean distance between all data points $x$ and their corresponding cluster centers $m_i$ of the cluster set $C$.
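The alternating assignment and update steps of K-Means can be sketched as follows. This is a minimal plain-Python version of the classic Lloyd iteration with user-supplied initial centers; it is meant as an illustration and is not the standard library implementation used in the experiments.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, centers, n_iter=100):
    """Lloyd iteration; `centers` holds the k initial cluster centers."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for i in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
labels, centers = k_means(points, centers=[points[0], points[3]])
```

For the two well-separated toy groups, the iteration converges after one update, with each center at the mean of its three members.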
The Gaussian Mixture Model (GMM) fits Gaussian distributions to the data and evaluates cluster membership based on likelihoods rather than distances [47]. The cluster centers are the means of the distributions.
To overcome the potential issue of uncertainty in the clustering assignment, the Fuzzy-c-Means clustering algorithm can be applied. Each data point can be assigned to several clusters by allocating probabilities with which it belongs to each cluster [48].
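The soft assignment can be illustrated with the standard Fuzzy-c-Means membership formula for fixed cluster centers. The fuzzifier `m = 2` is a common default and an assumption here, since the text does not state the value used; the function name is hypothetical.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Membership of `point` in each cluster for fixed centers (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:
        # The point coincides with at least one center: share membership there.
        hits = dists.count(0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


centers = [[0.0, 0.0], [4.0, 0.0]]
near_first = fuzzy_memberships([1.0, 0.0], centers)
midway = fuzzy_memberships([2.0, 0.0], centers)
```

The memberships of a point always sum to one; a point midway between two centers is shared equally, while a point closer to the first center is assigned mostly, but not exclusively, to it.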
The Self-Organizing Maps (SOM) technique developed by Teuvo Kohonen [12,49] is another neural-network-based approach which projects high-dimensional datasets into a low-dimensional representation, inspired by the different neurological sensory mapping in the cortex of the brain. This mapping can be achieved by different kinds of "self-organized" unsupervised learning techniques.

Experiment
In the previous section, various dimensionality reduction techniques and clustering algorithms were introduced. To evaluate each approach in its ability to generate well-clustered SCMs, the experiment was designed as follows: using three different preprocessed CRISM datasets (FRT0000d3a4, FRT0001c479 and FRT0001c71b), we examine each method for its clustering property. A method was defined by the combination of the discussed dimensionality reduction techniques and clustering algorithms. In total, a set of 16 different methods was studied.
For the autoencoder, we follow Gao et al. [5] and determine the number of latent variables by HySime [50]. As an activation function, a rectified linear unit (ReLU) was used. For the PCA projection to a lower-dimensional space, we use five extracted components, since this number of components explains about 95% of the variance in the data. In the case of t-SNE and UMAP, the original spectral dimension was reduced to two dimensions. The perplexity (t-SNE) and number-of-neighbors (UMAP) parameters used for the manifold approximation were each set to 100. The clustering was performed by using the standard implementations of all algorithms [51-53]. As the true number of classes is not known, the number of clusters under investigation ranged from 2 to 20.
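The size of the experiment follows directly from the combinations described above. The short sketch below enumerates them; the run list only names the combinations and does not perform any computation.

```python
from itertools import product

REDUCERS = ["Autoencoder", "PCA", "t-SNE", "UMAP"]
CLUSTERERS = ["K-Means", "GMM", "Fuzzy-c-Means", "SOM"]
DATASETS = ["FRT0000d3a4", "FRT0001c479", "FRT0001c71b"]
CLUSTER_COUNTS = range(2, 21)  # 2 to 20 clusters inclusive

# A "method" is one dimensionality reduction technique plus one clustering algorithm.
methods = [f"{reducer} + {clusterer}" for reducer, clusterer in product(REDUCERS, CLUSTERERS)]

# Every method is evaluated on every dataset for every candidate cluster count.
runs = list(product(DATASETS, methods, CLUSTER_COUNTS))
```

This gives the 16 methods mentioned in the text and, with 19 candidate cluster counts on three datasets, 912 individual cluster maps.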
To give a better idea of the shape of the obtained results, Figure 2 shows a subset of the generated SCMs. Each image features a different investigated method for 10 clusters, based on the FRT0001c71b dataset. Cluster membership is characterized by color in the pictures. We illustrate the results of an autoencoder in Figure 2a as a comparison to the method established by Gao et al. [5]. Figure 2b shows the results using the standard PCA method, while Figure 2c shows our result using UMAP. The mineralogy will be discussed later in our analysis of image browse products.
The structure of the individually produced cluster maps does not differ fundamentally. In particular, Figure 2a,c exhibit very similar clustering properties. However, there are differences in the details, indicating that some algorithm combinations are better than others.

Evaluation
To assess the clustering performance in a quantitative manner, we compute multiple unsupervised cluster-separation metrics for evaluation. To start with, the Calinski-Harabasz (CH) index [54] for a dataset $E$ with $n_E$ pixels split into $k$ clusters is defined as the ratio of the dispersion between and within clusters:
$$\mathrm{CH} = \frac{\operatorname{tr}(B_k)}{\operatorname{tr}(W_k)} \times \frac{n_E - k}{k - 1}$$
where:
$$W_k = \sum_{q=1}^{k} \sum_{x \in C_q} (x - c_q)(x - c_q)^T, \qquad B_k = \sum_{q=1}^{k} n_q (c_q - c_E)(c_q - c_E)^T$$
with $C_q$ denoting the set of points in cluster $q$, $c_q$ the center of cluster $q$, $c_E$ the center of $E$ and $n_q$ the number of points in cluster $q$. The measure indicates a higher score when clusters are dense and well separated.
The Davies-Bouldin (DB) index [55] is based on the average similarity between each cluster $i$ and its most similar one $j$ and is given by:
$$\mathrm{DB} = \frac{1}{k} \sum_{i=1}^{k} \max_{j \neq i} R_{ij}$$
where:
$$R_{ij} = \frac{s_i + s_j}{d_{ij}}$$
is the cluster similarity measure, $s_i$ is the cluster diameter and $d_{ij}$ is the distance between cluster centroids $i$ and $j$. A lower score refers to a higher cluster validity.
As a final measure, the Silhouette Coefficient is bounded between −1 for incorrect clustering and +1 for highly dense clustering, whereby scores around zero indicate overlapping clusters. Thus, a significant advantage of this metric is that it allows direct conclusions about the goodness of the clustering algorithm. The Silhouette Coefficient (SC) [56] for a single sample can be written as:
$$s = \frac{b - a}{\max(a, b)}$$
The measure relates the mean distance $a$ between a point and all other points in the same group to the mean distance $b$ between the point and all samples in the next nearest cluster. The value of SC for a produced SCM is the average of the coefficient for each pixel.
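The two baseline indices can be written down compactly using the fact that the trace of each dispersion matrix equals a sum of squared distances. The following plain-Python sketch is only an illustration on toy data; the helper names are hypothetical, and the cluster diameter is taken as the mean distance of the members to their centroid, which is an assumption about the exact definition used.

```python
import math


def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]


def group_by_label(points, labels):
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return list(clusters.values())


def calinski_harabasz(points, labels):
    clusters = group_by_label(points, labels)
    n, k = len(points), len(clusters)
    c_all = centroid(points)
    # tr(B_k): size-weighted squared distances of cluster centers to the global center.
    between = sum(len(c) * math.dist(centroid(c), c_all) ** 2 for c in clusters)
    # tr(W_k): squared distances of all points to their own cluster center.
    within = sum(math.dist(p, centroid(c)) ** 2 for c in clusters for p in c)
    return (between / within) * (n - k) / (k - 1)


def davies_bouldin(points, labels):
    clusters = group_by_label(points, labels)
    centers = [centroid(c) for c in clusters]
    # s_i (assumed definition): mean distance of the members of cluster i to its center.
    spread = [sum(math.dist(p, m) for p in c) / len(c) for c, m in zip(clusters, centers)]
    k = len(clusters)
    worst = [
        max((spread[i] + spread[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k


points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
good = [0, 0, 0, 1, 1, 1]
bad = [0, 1, 0, 1, 0, 1]
ch_good, ch_bad = calinski_harabasz(points, good), calinski_harabasz(points, bad)
db_good, db_bad = davies_bouldin(points, good), davies_bouldin(points, bad)
```

On the toy data, the labelling that follows the two visible groups yields the higher CH value and the lower DB value, which is the behavior the text describes.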
In order to identify the most appropriate method, the metrics are reported over all the calculated numbers of clusters. The first two measures (CH, DB) are fast to compute and will be shown as a baseline. Figure 3 illustrates the values of the scores plotted against the number of clusters for the same dataset and methods as the pictured SCMs in Figure 2. Some of the graphs show strong fluctuations. However, there exists a high level of evidence for a rather small number of clusters according to the reported scores (compare Figure 4). Therefore, we proceed with a further analysis in which the mean of the measures over a predefined number of clusters is calculated. As pointed out, there is a strong trend for a small number of classes. Thus, the examination was restricted to a class range of 3 to 7.
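The averaging over the restricted cluster range can be sketched as below. The score values are invented placeholders that only make the example run; they are not results from this study, and the method names are hypothetical.

```python
from statistics import mean


def mean_score(scores_by_k, k_min=3, k_max=7):
    """Mean of a metric over the cluster counts k_min..k_max (inclusive)."""
    return mean(scores_by_k[k] for k in range(k_min, k_max + 1))


# Placeholder CH scores per cluster count for two hypothetical methods.
scores = {
    "method A": {k: 1000.0 / k for k in range(2, 21)},
    "method B": {k: 800.0 / k for k in range(2, 21)},
}
summary = {name: mean_score(by_k) for name, by_k in scores.items()}
best = max(summary, key=summary.get)  # for CH, a higher mean is better
```

For the DB index, the same averaging applies, but the best method is the one with the lowest mean.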
In Table 1, we list the mean values of the CH score for all three selected CRISM spectral cubes and all methods. As outlined by Milligan and Cooper [57], the CH score is a powerful criterion for evaluating the validity of clustering (compare Appendix A for the DB index). Apart from a few exceptions, there is a consistent pattern between the Calinski-Harabasz and Davies-Bouldin metrics when establishing rank statistics of the individual scores for each dataset, where a higher rank indicates denser clusters. On first viewing, some general remarks can be made. The most striking one is that UMAP in combination with any examined clustering algorithm performs best across all three spectral images in terms of the CH score, whereas the Autoencoder + GMM displays the lowest score in two of three cases. Broken down by dimensionality reduction technique, there is no clear ranking after UMAP. However, the results suggest that the t-SNE approach is also capable of outperforming the Autoencoder and PCA in the case of the FRT0000d3a4 and FRT0001c479 data.
Another finding is that K-Means and Fuzzy-c-Means have higher scores in comparison with the other clustering algorithms for the same feature extraction approach. However, this statement should be treated with caution, since the CH and DB indices tend toward higher and lower scores, respectively, for convex clusters like those generated by K-Means and Fuzzy-c-Means.
Nevertheless, UMAP + K-Means can be identified as the best method in our experimental setting.
Additionally, we tracked the computation time for each dimensionality reduction technique. On average, PCA is the fastest technique with 4 s CPU time, while t-SNE takes much longer (1675 s). The Autoencoder (559 s) and UMAP (789 s) fall in between (processor: Intel Xeon Gold 6140, 2.3 GHz, 8 virtual CPUs).
After the evaluation and selection of the best method to create SCMs from our Mars data, we need to determine the most appropriate number of clusters. For this purpose, we take the Silhouette coefficient as a validation measure, as it also enables us to make some remarks about the goodness of the clusters in general. In Figure 4, the metric is reported for the UMAP + K-Means method over a reduced range of possible clusters.
In general, all graphs confirm the choice of a small number of classes, as the values for the coefficient drop with an increasing number of clusters. The number of clusters corresponding to the maximum score for each region is as follows: three clusters for FRT0000d3a4, four clusters for FRT0001c479 and six clusters for FRT0001c71b. These findings are mostly in line with the CH and DB indices. Furthermore, all maximum scores are about 0.50 and hence indicate an accurate clustering in large parts of the array.
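Selecting the number of clusters by the maximum mean Silhouette coefficient can be sketched as follows. The implementation is a plain-Python illustration on toy points, and the scores passed to `best_cluster_count` are invented placeholders, not values read from Figure 4.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs at least two clusters)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton cluster contributes 0
        # a: mean distance to the other members of the own cluster
        # (the zero distance of p to itself is in the sum, hence len - 1).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


def best_cluster_count(score_by_k):
    """Cluster count with the highest mean silhouette coefficient."""
    return max(score_by_k, key=score_by_k.get)


points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [9.0, 10.0], [10.0, 9.0]]
sc_good = silhouette_score(points, [0, 0, 0, 1, 1, 1])
sc_bad = silhouette_score(points, [0, 1, 0, 1, 0, 1])
chosen_k = best_cluster_count({3: 0.41, 4: 0.50, 5: 0.44})  # placeholder scores
```

The silhouette computation compares every pair of points, so for full 400 × 400 maps it is considerably more expensive than the CH and DB indices.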
To demonstrate the superior performance of the UMAP dimensionality reduction technique in the context of Mars data, and particularly of the UMAP + K-Means approach compared with the remaining methods, Figure 5 shows the Silhouette coefficient for the identified number of clusters. While the Autoencoder, PCA and t-SNE show similar values across the four clustering algorithms, UMAP exhibits significantly higher scores.

Discussion
According to both baseline metrics (Calinski-Harabasz and Davies-Bouldin), UMAP combined with the K-Means cluster procedure clearly shows the best scores (compare Table 1 and Table A1). At the same time, its computing time is moderate, although longer than that of the PCA calculations. Consequently, this method was selected and optimized with respect to the number of clusters, for which the aforementioned metrics for evaluating clustering performance can be applied. Figure 4 shows that 3, 4 or 6 clusters are proposed for the individual datasets investigated in this study. The final cluster map overlaid on the images is shown in Figure 6a for FRT0001c71b.
Gao et al. [5] used expert maps to assign geological properties to the clusters, in order to relate the clusters to geomorphological properties of the surface. Similarly, we use the so-called summary products derived by Pelkey et al. [24] and Viviano et al. [25], which retrieve mineralogical information by evaluating specific band structures within the given spectral range. For a detailed description, we refer to Pelkey et al. [24]. We try to determine which minimum set of these products can be used as input features in a random forest model to classify the SCM-labeled pixels with good accuracy. The dataset consisting of the SCMs and the corresponding browse products from image set FRT0001c71b was split into training and test sets using a ratio of 0.25. Applying common feature reduction procedures shows that four spectral bands are sufficient without a large loss in prediction accuracy. Furthermore, we compute permutation importance for feature evaluation using the random forest classifier as an estimator. Our analysis shows that four of the five features in the following list are dominant for all datasets: RBR, R770, BD860_2, BDI1000VIS and R1080. Besides the expected reflectance at 770 nm, the bands in the NIR dominate. Viviano et al. [25] related this to the presence of olivine and pyroxenes. The 860 nm band plays a dominant role in discriminating ferric minerals, such as hematite [25]. The positions of the final features within the spectra are shown in Figure 7; they essentially describe the silhouette of the spectra, dividing it into four distinct areas. It is plausible to relate these clusters to geomorphological compositions. Coprates Chasma shows brighter areas, which we can see in Figure 6. According to Loizeau et al. and Fueten et al. [36,58], light-toned areas could consist of hydrated minerals. Further analysis has to show whether such clusters can be differentiated further depending on the different minerals.
It is, therefore, now feasible to apply the CaSSIS filter response to the spectral data. We calculate four new CaSSIS-like features. Within the wavelength range of CaSSIS, there are electronic transitions and crystal field effects caused by the presence of ferrous (Fe²⁺) iron-bearing minerals that produce diagnostic absorptions between 700 and 1100 nm (e.g., mafic minerals such as olivine and pyroxene). The CaSSIS sensitivity range also includes diagnostic broad absorptions which arise from intervalence charge-transfer transitions of ferric iron Fe³⁺ and O²⁻ and are present in altered ferric (Fe³⁺) iron-bearing minerals (e.g., hematite, nontronite, etc.) [26].
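The permutation importance used earlier in this section to rank the summary products can be illustrated independently of the classifier. In the sketch below, a simple threshold rule stands in for the fitted random forest, and the two-feature dataset is synthetic; both are hypothetical and serve only to show how shuffling one feature column reveals its influence on the prediction accuracy.

```python
import random


def accuracy(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical stand-in for the fitted classifier: only feature 0 matters.
def predict(row):
    return int(row[0] > 0.5)


# Synthetic data with two features per sample; feature 1 is irrelevant.
X = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
y = [predict(row) for row in X]
importances = permutation_importance(predict, X, y)
```

Shuffling the irrelevant feature leaves the accuracy unchanged, so its importance is zero, whereas shuffling the informative feature lowers the accuracy noticeably.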
The CaSSIS filters were selected to give good overlap with the HiRISE bandpasses, but splitting the NIR bandpass in HiRISE into two separate bandpasses (RED and NIR in CaSSIS) is needed to improve mineralogical distinction. The HiRISE filters are given in McEwen et al. [1], while the CaSSIS filters are described in Thomas et al. [3] or Gambicorti et al. [59,60]. Table A2 in Appendix C provides a summary of the bandpasses. The effective central wavelengths of the CaSSIS filters (taking into account the optical transmission and detector response) are also given.
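Computing CaSSIS-like features from a hyperspectral spectrum amounts to a transmission-weighted average per filter. The sketch below uses boxcar placeholders for the four filters; the band edges are invented for illustration and are NOT the bandpasses of Table A2 or the measured filter curves, and the spectrum is a synthetic ramp.

```python
def filter_response(wavelengths_nm, reflectance, transmission):
    """Transmission-weighted mean reflectance seen through one filter."""
    weights = [transmission(w) for w in wavelengths_nm]
    total = sum(weights)
    if total == 0:
        raise ValueError("no channel falls inside the filter bandpass")
    return sum(t * r for t, r in zip(weights, reflectance)) / total


def boxcar(lo_nm, hi_nm):
    """Idealized filter: full transmission in [lo_nm, hi_nm), none outside."""
    return lambda w: 1.0 if lo_nm <= w < hi_nm else 0.0


# Placeholder band edges in nm (assumption, not real CaSSIS values).
HYPOTHETICAL_BANDS = {
    "BLU": boxcar(436, 570),
    "PAN": boxcar(570, 780),
    "RED": boxcar(780, 880),
    "NIR": boxcar(880, 1100),
}

# Synthetic spectrum: reflectance rising linearly with wavelength.
wavelengths = list(range(440, 1100, 10))
reflectance = [0.1 + 0.0002 * (w - 440) for w in wavelengths]

features = {
    name: filter_response(wavelengths, reflectance, transmission)
    for name, transmission in HYPOTHETICAL_BANDS.items()
}
```

Replacing the boxcar functions with the measured transmission and detector response would give the effective band values; the weighting scheme itself stays the same.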
Similarly, we apply a K-Means algorithm to the selected summary products and to the dataset of four CaSSIS features to generate comparable cluster maps. These maps are shown in Figure 6b,c. A strong visual correlation between these maps can be clearly seen, which in principle makes it possible to retrieve basic information about both the surface structure and the Fe-mineralogy.

Conclusions
In this paper, a simple and fast method is proposed to derive spectral clusters from hyperspectral data in the visible wavelength range. The analyses show that the UMAP algorithm in combination with the K-Means clustering method provides results quickly and, based on common cluster metrics, provides comparable or even better results than other proposed methods. With respect to the evaluation and combination of large hyperspectral datasets, this can be a decisive factor.
Comparison with the similarly generated maps based on summary products demonstrates the high information content of four bands, partially from the NIR, discriminating especially the Fe mineralogy within that area. The reduced number of relevant bands and a proposed cluster number between 3 and 6 confirm that the four CaSSIS filter bands were a reasonable selection, sufficient for mineralogical analyses (at least for the Coprates Chasma region) at visible wavelengths.
The proposed methodology can also be utilized to vary filter parameters and to propose new settings for future missions. Additionally, it is possible to use this procedure to generate "new" or slightly different combinations of spectral bands, resulting in different image browse products.
It must be emphasized that the results can depend strongly on the data selection, the preprocessing and the signal-to-noise ratio. Thus, this procedure should be used as an aid alongside other analyses. Therefore, further iterative optimization of the procedure regarding robustness and the extension of the analyses to the geologically more relevant wavelengths in the NIR range are planned.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Table A1. Mean of the Davies-Bouldin criterion over a range of three to seven clusters, split by method and region. The best score for each dataset is in bold.

Figure 1. Location map of the cubes used in the present work. Left: color-coded MGS MOLA hillshade over Coprates Chasma and surrounding plateau; the white outline indicates the extent of the right panel (data). Right: location of the three overlapping CRISM observations used here; the background imagery consists of HRSC Level4 Nadir imagery, orbit h7201.

Figure 3. Calinski-Harabasz and Davies-Bouldin index as a function of the number of clusters for the FRT0001c71b dataset. Each subfigure (a-c) represents quantitative analysis for the same combinations of the dimensionality reduction technique and clustering algorithm as in Figure 2.

Figure 4. Silhouette score of UMAP + K-Means as a function of the number of clusters for all three datasets.

Figure 6. Spectral cluster maps generated for a configuration of six clusters and the FRT0001c71b dataset. (a) Spectral cluster map with UMAP + K-Means applied; (b) cluster map based on four selected summary products; (c) cluster map based on CaSSIS bands.

Figure 7. Spectral information with supporting bands, dividing the area into four distinct ranges.

Author Contributions: Writing (original draft), M.F. and A.P.; Writing (review and editing), N.T., A.P.R. and B.E. All authors contributed substantial work at every stage of this publication. All authors have read and agreed to the published version of the manuscript.
Funding: A.P.R. has been supported by the Europlanet 2024 RI and has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 871149.

Table 1. Mean of the Calinski-Harabasz criterion over a range of 3 to 7 clusters, split by method and region. The best score for each dataset is in bold.