1. Introduction
The materials exposed at the surface of Mars are varied in composition and are found in a range of states of aggregation. Dust, sandy soils and sediments, pebbles, and rocks are globally distributed, have been observed over three decades of orbiter missions to Mars, and have been analysed and characterised at outcrop scale by lander instrumentation. Detailed mapping of the distribution and spatial variation of Mars’s surface materials is a necessary and important task. The distribution of materials needs to be known when selecting landing sites, due to: (a) engineering requirements for landing instrumentation; and (b) the selection of sites of scientific interest. Determination of rock-size frequencies is essential for evaluating safety in the landing and operations of rovers [
1]. It is important to avoid landing in terrains covered by thick layers of dust or loose sediments, as landing locations must be selected to include outcrops of geological significance to collect scientific data [
2]. Knowledge of the spatial variation of surface materials is also significant from a purely scientific viewpoint, because what is presently observed represents a snapshot of the geological and atmospheric processes which occurred in the recent past. While Martian dust particles of dimensions of ~1–10 μm remain in suspension indefinitely [
3], larger grains of dust and silt (diameter: 60 μm) are lifted, carried in Mars’s atmosphere, and eventually fall out and become part of the Martian soil. Larger grained materials, such as sand particles (up to a few hundred μm in size) may be moved by saltation [
4] and, upon breaking, may be lifted and carried over large distances by the atmospheric currents. Larger particles (such as coarse grained sand and hematite concretions up to 1–5 mm) may move only short distances by drag, but usually accumulate as lag deposits [
5,
6]. Consequently, the mineralogical and chemical characterisation of materials, evaluated against their particle sizes, provides insights into the provenance of the material, leading to valuable geological inference.
Other than from direct measurements of grain dimensions of minerals within an outcrop (e.g., by the Microscopic Imager onboard the Mars Exploration Rovers Spirit and Opportunity), the best knowledge of the distribution of surface materials and their size frequencies comes from estimates of their physical properties derived from remote-sensing data. Martian global dust is characterised by low values of thermal inertia and high values of albedo.
Conversely, high values of thermal inertia and low albedo are typical of rock and duricrust [
2]. Intermediate values of these physical properties, however, cannot be readily interpreted [
7]. Therefore, previous authors (e.g., [
8]) have applied a supervised pixel classification method to measurements of thermal inertia and albedo acquired by the Thermal Emission Spectrometer (TES) on board the Mars Global Surveyor orbiter. Their work resulted in a map of surface materials subdivided into seven broad classes [
8,
9]. Of these, three were clearly characterised: (A) low TI-high albedo, corresponding to unconsolidated fines; (B) high TI-low albedo, corresponding to (mostly) sand, rock and bedrock; (C) high TI-medium albedo, corresponding to (mostly) duricrust. The remaining four classes (D–G) represented outliers in the values of TI and albedo, of uncertain interpretations. In a follow-up paper, the effects of horizontal mixtures and layering of two end-members, as well as slope, were modelled [
10], with the resulting 5° resolution thermophysical maps showing the prevalent effect of layering at mid-latitudes and in the polar regions, with less common and more localized horizontal mixing and slope effects.
As shown in
Table 1, combinations of thermal inertia (TI) and albedo can discriminate Martian materials ([
12–
33]).
In-situ investigations by rover instruments Mini-TES (Miniature Thermal Emission Spectrometer) and MI (Microscopic Imager) at Meridiani Planum (Opportunity; e.g., [
5,
6,
34,
35]) and Gusev Crater (Spirit; e.g., [
6,
35–
37]), and ChemCam at Gale Crater (Curiosity; e.g., [
38–
40]), have unveiled a rich variety of surface materials of variable size and state of aggregation. These include bedrock, rocks, pebbles, loose sediments, dust, compacted sediments, hollow-filling sediments, boulders and duricrust. Compositional differences also exist. For example, in Gale crater both a fine-grained soil of mafic composition and a coarse-grained soil of felsic composition were detected along a Curiosity transect, with the latter appearing to be locally derived [
39], and the former akin to global Mars mafic dust [
39,
41]. Mafic soils similar in composition to Martian dust were also observed by Spirit and Opportunity (e.g., [
42,
43]).
Thermal inertia and albedo provide a means of probing the physical properties of shallow subsurface materials which may be obscured by thermally thin coverings of dust and particulates. Examples of applications are: searching for obscured bedrock and thermally distinct units (e.g., floors of paleolakes [
44]); identifying near-surface ice in lobate debris aprons and pingoes (e.g., [
45]); and modeling regional ice stability and permafrost depth [
28]. In addition, thermophysical properties of the surface provide necessary inputs to atmospheric general circulation models (GCMs) [
46], and are still applied as an early criterion in landing site selection [
47]. Improving the understanding of the thermophysical properties of the Martian surface at the ~3 km resolution of the Thermal Emission Spectrometer (TES) remains scientifically valuable, despite the availability of thermal inertia values at ~100 m/pixel resolution from the Thermal Emission Imaging System (THEMIS) [
48]. Previous maps of thermophysical units (e.g., [
8,
49]) remain regularly cited and utilised by the Mars community. Important applications of these global maps are to provide a thermophysical context for spectral and visual observations of the surface and for point measurements (such as
in-situ investigations like those at the Phoenix landing site [
50]). Data resolution of 3 km is suitable for regional-scale analysis and interpretation, contributing to understanding the relationships between surficial geology, orbital mineralogical features [
51–
53], and visual morphologies (e.g., slope streaks [
54]). Identification of extensive kilometer scale features, such as thermally distinct preserved impact crater ejecta [
55], is also an ideal application of thermophysical maps.
This work evaluates unsupervised approaches to mapping thermophysical units. These approaches differ from previous works [
7–
9,
56–
58]. All previous thermophysical mappings determined divisions between thermal inertia and albedo units by manually applying thresholds to isolate the strongest peaks and highest pixel densities in the global distribution of thermal inertia and albedo values. Prior mappings were therefore sensitive to the globally-dominant mixtures of material types, particularly those with very high or low albedo and very low thermal inertia, such as bright fine dust, dark sand and bright ice (
Table 1).
Table 2 presents the values of Martian thermal inertia and albedo that would result in a unique interpretation of a single surface component material dominating the pixel. For example, if a pixel has an orbital thermal inertia of 2000 tiu and an albedo of 0.14, then it can be uniquely interpreted as a rock dominated surface within this classification scheme. The unambiguous values given in
Table 2 however occur in only ~33% of pixels in the global map. The remaining ~67% of the map is comprised of more than one surface component. It is not surprising that the majority of the Martian surface is not uniquely classified from orbital thermophysical data, given the heterogeneities discussed above and that each pixel in the thermal inertia and albedo maps encompasses a surface area of ~9 km². This means the separability of classes for a large majority of the Martian surface is inherently low, which makes it important to understand the uncertainties in existing class assignments and the strengths and weaknesses of the manual approaches used in previous studies. The uncertainties associated with manual classifications can be difficult to assess, and previous work assessing the uncertainties in the boundaries between thermophysical classes has been limited. Comparisons with alternative, less deterministic approaches that have been used successfully in terrestrial remote sensing, such as those examined here, can help to address these issues.
Given the complexities inherent with pixel classification of orbiter thermal data, and the importance of thermophysical maps for understanding geological processes on Mars, it is worthwhile to explore alternative classification methods to: (a) corroborate the results of earlier maps and classifications obtained by different techniques; (b) identify any areas of discrepancy; and, (c) interpret new insights deriving from examination of discrepancies. In this work, a comparison of different classification methods is presented, and their ability to improve the classification resolution of Martian thermophysical maps is analysed. The strengths and weaknesses of these new unsupervised approaches are explored, together with examples of applications to specific Martian areas to validate the methods. Some of the factors affecting the sensitivity of the classification algorithms are analysed. The unsupervised classification approach presented here can provide a powerful alternative to manual classification procedures, which are both deterministic in nature and potentially more time consuming. Here we derive and map the geographic distribution of seven Martian thermophysical units, which we discuss by detailed examination of the treatment of the dataspace, and compare them to thermophysical units published in earlier papers [
8,
9]. The unsupervised partitioning of thermophysical units offers new insights into the interpretation of Martian lithologies, stratigraphies, and geological reconstructions. Furthermore, while manual classification techniques are challenging to apply to more than two datasets, the techniques examined here can be applied to N-dimensional data without significantly increased difficulty for the user. This enables their use, for example, in the derivation of multi-dimensional groupings of Martian surface materials characterized through a combination of thermophysical data (albedo and thermal inertia) and mineralogy.
Our work utilizes the 2007 values of thermal inertia which were derived from a large number of seasonal brightness temperature observations from the Thermal Emission Spectrometer and were compiled into an updated thermal inertia map by [
10]. The algorithms applied in this work are well documented and are frequently applied to similar classification problems. Gaussian Mixture Models (GMMs) have been shown to provide a good approximation to many diverse data distributions [
59–
61]. Clustering techniques such as Iterative Self-Organizing Data Analysis Technique (ISODATA) and Maximum Likelihood (MAXLIKE) have been successfully applied to detect the most likely geographic origin of different strains of avian influenza [
62] and to characterise galaxy spectra from the Sloan Digital Sky Survey [
63], in addition to their usual application to problems related to identification of land cover from remote sensing data sets [
64,
65]. Two-band classification, for probing the relationship between the red and near infra-red channels, has been of enduring use for investigating surface characteristics across a range of sensors (including Landsat satellites, MODIS and AVHRR, [
66]). Many studies have focused explicitly on the relationship between these two bands (e.g., [
67–
70]), and developing indices for interpreting spectral and land information space. A few studies have also applied algorithmic classification techniques (both supervised and unsupervised) to Martian datasets, including: hyperspectral imagery and mineralogy data from orbital measurements [
71–
77] and ground-measurements by the rovers [
78]; terrain mapping and feature classification from elevation and surface roughness data [
79–
82] and visual imagery [
83]; and automated detection of impact craters [
84,
85]. The use of algorithmic classification in studies of Mars is increasing; however, no previous study has applied algorithmic classification to mapping surface grain size and thermal behaviour in Martian thermal inertia and albedo data.
Thermal Inertia & Albedo
Martian thermal inertia and albedo have been discussed in detail by numerous authors and so only a brief description is given here. The data values used in this study are shown in
Figure 1. The thermal inertia of a material is a measure of its ability to conduct and store heat [
56,
86,
87]. Thermal insulators (materials with low thermal conductivity) have low thermal inertia, so they rapidly heat and cool at their surface due to their poor ability to distribute heat through conduction into their interior. Thermal inertia values of planetary surfaces are determined by a complex combination of particle size, bedrock outcrop and rock abundance [
10], and degree of cementation [
7]. Martian albedo is the fraction of incident visible to near-infrared (0.3–2.9 μm) solar radiation reflected by the surface [
88]. Both thermal inertia and albedo are needed to understand the thermophysical properties of the surface. Thermal inertia correlates strongly with apparent grain size (particle size and degree of induration, e.g., [
12,
21]), whereas albedo correlates with mineralogy [
15]. Surfaces can display similar orbital thermal inertias, despite significantly different horizontal heterogeneity or vertical layering [
10,
89]. For example, both the Phoenix and Opportunity landing sites had an apparent thermal inertia of 200 tiu, but could be distinguished through different orbital albedo values of 0.2 and 0.15 respectively (Phoenix [
28,
50]; Opportunity [
35,
90,
91]). In addition, the subsurface thermal environment is controlled by both parameters, as albedo determines the degree to which solar insolation is absorbed by the surface (bright materials reflect more solar radiation and absorb less heat), and thermal inertia governs the distribution of that heat at depth [
7,
11].
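For reference, thermal inertia is conventionally defined as I = √(kρc), where k is thermal conductivity, ρ is bulk density and c is specific heat capacity. The formula is not written out in the text above, so the following is only a minimal sketch of that standard definition. The function name and the material property values are illustrative assumptions chosen for this example and are not taken from the datasets used in this work.

```python
import math


def thermal_inertia(k, rho, c):
    """Thermal inertia I = sqrt(k * rho * c), in J m^-2 K^-1 s^-1/2 (tiu).

    k   : thermal conductivity, W m^-1 K^-1
    rho : bulk density, kg m^-3
    c   : specific heat capacity, J kg^-1 K^-1
    """
    return math.sqrt(k * rho * c)


# Hypothetical property values, for illustration only (not from the paper).
insulating_fines = thermal_inertia(k=0.004, rho=1300.0, c=600.0)
dense_rock = thermal_inertia(k=2.0, rho=2800.0, c=800.0)

print(f"insulating fines: {insulating_fines:.0f} tiu")
print(f"dense rock:       {dense_rock:.0f} tiu")
```

Because conductivity varies over several orders of magnitude between loose fines and solid rock, while density and heat capacity vary far less, the conductivity term tends to dominate the contrast in thermal inertia between such materials.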
Global observations by the Thermal Emission Spectrometer (TES), onboard the Mars Global Surveyor, were used to infer the albedo and thermal inertia of the Martian surface. Locally, surface values of these parameters acquired by landers ground-truth the satellite data [
92]. The sensing depth of TES is of the order of centimetres at infrared wavelengths, and of millimetres in the visual spectrum [
88]. Hence TES provides information only on the shallow subsurface. The resolution of TES is ~3 km/pixel [
93,
94]. The Martian surface is heterogeneous at this scale; therefore, the thermal inertia derived for each pixel represents an integrated radiance and apparent brightness temperature of the surface materials within the satellite’s field of view [
1,
57].
On Mars, high thermal inertia materials (such as rocks) predominantly have lower albedo values than small-grained, low thermal inertia materials (such as dust and sand). Bright, high-albedo regions on Mars indicate fine-grained surface dust or ice [
95,
96]. Dark regions correspond to mixtures of sand, rocks, or duricrust (cemented sand sized grains) with smaller proportions of dust. The 3D histogram of Mars’s global albedo and thermal inertia contains four local maxima (
Figure 1). One peak is due to the contribution of bright and finely grained dust on the Martian surface (albedo 0.27, thermal inertia 55 J·m⁻²·K⁻¹·s⁻¹ᐟ², hereafter tiu). The remainder include contributions from a range of materials of varying grain sizes, including sand, rocks, and duricrust. A scatterplot of global thermal inertia and albedo values on Mars (
Figures 1 and
2) reveals the complex relationship between these variables. The classification results presented in this work will be compared to these plots to determine their sensitivity to the major groupings within the two-dimensional albedo-thermal inertia dataspace.
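The idea of locating modal peaks in the joint albedo and thermal inertia distribution can be sketched as follows. This is a minimal illustration, not the procedure used to produce the figures: the function names, the bin widths, and the rule that a peak must exceed all eight neighbouring bins are assumptions made for the example.

```python
from collections import Counter


def histogram_2d(albedo, inertia, albedo_step=0.01, inertia_step=10.0):
    """Count pixels per (albedo bin, thermal inertia bin).

    The bin widths are hypothetical defaults, not values from the paper.
    """
    counts = Counter()
    for a, ti in zip(albedo, inertia):
        counts[(int(a // albedo_step), int(ti // inertia_step))] += 1
    return counts


def local_maxima(counts):
    """Return bins whose count exceeds that of all eight neighbouring bins."""
    peaks = []
    for (i, j), n in counts.items():
        neighbours = [
            counts.get((i + di, j + dj), 0)
            for di in (-1, 0, 1)
            for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)
        ]
        if all(n > m for m in neighbours):
            peaks.append(((i, j), n))
    # Strongest peak first.
    return sorted(peaks, key=lambda p: -p[1])


# Tiny synthetic sample: a bright, low-inertia cluster and a darker one.
albedo = [0.275] * 5 + [0.285] * 2 + [0.145] * 3
inertia = [55.0] * 5 + [55.0] * 2 + [255.0] * 3
peaks = local_maxima(histogram_2d(albedo, inertia))
```

With real data the number of peaks found depends strongly on the bin widths, so any comparison of class boundaries against modal peaks should state the binning used.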
2. Data
The procedure used to determine thermal inertia and albedo using the TES data and the technical details of the TES experiments have been widely published (for example, [
7,
8,
10,
56,
88]). Additional details of the data are given in
the Appendix.
The albedo measurements used here were taken within Martian year MY24, which was characterised by minimal localised dust storm events [
97], and a lower dust optical depth (the atmosphere was more transparent) than in MY25 and MY26 [
98,
99]. The albedo values in MY24 should thus be the most representative of the mean surface materials, being least affected by scattering due to atmospheric dust. The variability in albedo values over MY24 to MY26 was less than ±0.06 over the vast majority of the Martian surface [
10]. This albedo dataset differs from that utilized by [
8], which incorporated data from MY25 in the albedo map. The instrument uncertainty in albedo values is approximately ±0.01 [
88]. Orbital measurements comprise ~35% (global coverage) of the albedo map [
100]. Although observations make up a small fraction of the albedo map, the period over which they were acquired overlaps well with the time period during which the thermal inertia mapping occurred.
The TES thermal data used to produce the 2007 nightside bolometric thermal inertia dataset [
10] (
Figure 3) were taken over MY24–27. Data affected by high dust opacity were removed. The nightside map consists predominantly of local night-time values, but includes some daytime values in the polar regions [
10]. Uncertainties are a combination of instrument measurement error, uncorrected atmospheric effects, and uncertainties in the thermal model. Computational uncertainty in night-time bolometric thermal inertia is estimated to be <10%, and the nightside map values used here may include another <10% error from the other datasets incorporated into the interpolation scheme used to derive thermal inertia (e.g., albedo and dust opacity) and physics not included in the model [
10]. The thermal inertia values are the medians of 36 maps of data obtained across the four Martian years, extending to ±87° latitude (due to the orbital inclination of the spacecraft). Observations constitute ~93% of the map, as it includes a larger number of seasonal observations and incorporates data from more Martian years than the albedo map used here [
101]. This thermal inertia dataset differs from that utilized by previous works [
7,
8] (
Figure 2), as the earlier model for deriving thermal inertia only computed values within the range of 0–800 tiu [
8]. Thermal inertia values > 800 tiu encompass 5.7% of the newer 2007 map, so only a small fraction of pixels have values outside the earlier (2005–2006) model, but these thermal inertia values indicate distinctive surface characteristics (
Table 1). Additionally, the greater geographic coverage of the 2007 thermal inertia map introduces values that may differ from the interpolated values in the earlier maps. Finally, the thermal inertia dataset used here is more complete than that used in the thermophysical mapping of [
8], where observations constituted 60% of the map [
57].
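The per-pixel median over a stack of seasonal maps, and the fraction of pixels above a thermal inertia threshold, can be sketched as below. This is an illustrative sketch only: the function names, the use of `None` to mark a pixel with no usable observation, and the flat-list map layout are assumptions, and the actual compilation of the 2007 map involved steps that are not reproduced here.

```python
from statistics import median


def median_stack(maps):
    """Per-pixel median over equally sized maps given as flat lists.

    None marks a pixel with no usable observation in that map (an
    assumption of this sketch); such entries are ignored.
    """
    stacked = []
    for values in zip(*maps):
        valid = [v for v in values if v is not None]
        stacked.append(median(valid) if valid else None)
    return stacked


def fraction_above(values, threshold=800.0):
    """Fraction of observed pixels whose value exceeds the threshold.

    Expects at least one observed (non-None) value.
    """
    valid = [v for v in values if v is not None]
    return sum(v > threshold for v in valid) / len(valid)


# Three tiny synthetic "seasonal maps" of three pixels each.
seasonal_maps = [
    [100, None, 900],
    [120, None, 1000],
    [110, 50, None],
]
stacked = median_stack(seasonal_maps)
high_fraction = fraction_above(stacked)
```

Applied to the full map, `fraction_above` with an 800 tiu threshold is the kind of calculation that yields a figure such as the 5.7% quoted above.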
4. Results
The results of each classifier are shown as a scatterplot in
Figure 11 and summarised in
Table 5. The algorithmic classification methods applied in this work do not involve deterministic bias, yet it is essential to examine whether the results provide a reasonable partitioning of the dataspace. This is particularly important as in all methods the maximum number of classes was chosen prior to classification. An optimal classification has class boundaries closely aligning with features in the underlying distribution of values, so that the maximum amount of information is extracted from thermal inertia and albedo without the introduction of false patterns due to over-partitioning [
129,
148].
Comparing
Figure 3 with
Figure 6 reveals that the GMM results with five classes show the best alignment between class boundaries and the modal peaks in the data histogram—indicating that this classification scheme can perform very well for small numbers of classes. For seven classes, two modal peaks align with Gaussians (orange and yellow in
Figure 11). These Gaussians also align with peaks in the global thermal inertia and albedo histograms (
Figure 5). Other class boundaries from the seven-Gaussian classification do not align well with the data structure. For example, class 5 (purple) appears to be assigned to reproduce the pixel counts at moderate to high thermal inertia; however, some pixels at low thermal inertia and high albedo are assigned to this class when they would appear to fit more naturally within class 3 (fuchsia). In addition, the GMM algorithm did not converge for more than five classes when outliers were included within the dataset, indicating that it can be strongly skewed by outlying data values even when their counts are very low. Finally, from
Figure 5, the EM-fitted GMM reproduces the total counts within the dataset poorly, overestimating them by a factor of >2. This is also true for the five-Gaussian classification.
The ISODATA plus Maximum Likelihood classification also shows good alignment with the data structure for five classes (
Figure 6), although some class boundaries (e.g., the green class) are not aligned as well as they were in the GMM with five Gaussians. Unlike in the EM of a GMM classification, however, the alignment of classes with the data structure improves as the number of classes increases. For both the combined ISODATA + MAXLIKE and ISODATA on its own with seven classes, three of four modal peaks are aligned with a distinct class. The fourth modal peak is divided between the green and blue classes. This division appears somewhat arbitrary based on the data structure; however, in the Discussion we show that some coherent subdivisions like this can provide geologically useful information. The classes produced by the combined ISODATA + MAXLIKE algorithms are generally bimodal in thermal inertia and/or albedo (
Figure 7). This could suggest that a larger number of coherent classes can be identified; for example, class 4 (yellow) incorporates a broad range of albedo values that may be better subdivided to remove the class bimodality. Alternatively, bimodality may reflect poor placement of the class boundary. In summary, for seven classes, the ISODATA assignment combined with the refinement undertaken by MAXLIKE shows greater sensitivity to the underlying data structure than the GMM (
Figure 11).
An additional measure of improved clustering is a decrease in the intra-cluster variance, which is analogous to increasing the similarity among the pixels assigned to the class [
149,
150]. From
Table 5 the classes produced by ISODATA and MAXLIKE generally show lower variance for either one or both parameters than those produced by EM of a GMM, suggesting that clustering could be improved by subdividing some of the GMM classes [
151]. Two GMM classes (red and orange) have lower variances in both albedo and thermal inertia. It is difficult, however, to compare the variances of the GMM classes with those of the other classifiers, given the significantly different placement of class boundaries by the GMM algorithm (
Figure 11). The Maximum Likelihood method is at least as good as ISODATA on its own for the yellow, green, fuchsia, and orange classes in
Table 5. One artifact of the ISODATA algorithm is the straight line delineation between classes due to the Euclidean distance measure used by the algorithm to partition the data [
119,
152]. These boundaries cut across contours and are not a natural division within the data. The Gaussian decision criteria applied by MAXLIKE and EM of a GMM produce elliptical classes [
153], which appear to perform better at aligning the class boundaries with the underlying pixel density.
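The difference between the two decision rules discussed above can be illustrated with a small sketch. It contrasts a minimum Euclidean distance rule (the assignment step used by ISODATA-type clustering) with a Gaussian maximum likelihood rule. The function names, the restriction to axis-aligned (diagonal covariance) Gaussians, equal class priors, and the class parameters in standardised units are all simplifying assumptions for this example and do not reproduce the configuration used in this work.

```python
import math


def nearest_centre(point, centres):
    """Index of the class centre closest to the point (Euclidean distance)."""
    return min(range(len(centres)), key=lambda k: math.dist(point, centres[k]))


def gaussian_log_likelihood(point, mean, var):
    """Log density of an axis-aligned Gaussian (diagonal covariance)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
        for x, m, v in zip(point, mean, var)
    )


def max_likelihood(point, means, variances):
    """Index of the class whose Gaussian gives the point the highest density."""
    return max(
        range(len(means)),
        key=lambda k: gaussian_log_likelihood(point, means[k], variances[k]),
    )


# Hypothetical classes in standardised units: class 0 is compact,
# class 1 is broad. Equal priors are assumed.
means = [(0.0, 0.0), (1.0, 0.0)]
variances = [(0.01, 0.01), (1.0, 1.0)]
pixel = (0.45, 0.0)

by_distance = nearest_centre(pixel, means)
by_likelihood = max_likelihood(pixel, means, variances)
```

In this example the pixel lies slightly closer to the compact class centre, so the distance rule assigns it there, whereas the likelihood rule assigns it to the broad class because the pixel sits several standard deviations from the compact class. The distance rule always splits two classes along a straight line, while the likelihood rule produces curved boundaries whenever the class variances differ.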
In summary, the combination of ISODATA and MAXLIKE identifies the largest number of coherent classes which are aligned with the data structure. In addition, the intra-cluster variances from this combination are at least as good as those from ISODATA alone for seven classes. Thus the Maximum Likelihood classifier provides an improvement in sensitivity for delineating class boundaries over ISODATA on its own, and EM of a GMM. The discussion of the resulting spatial map and interpretation of the classes will therefore focus on the classification produced by the combination of ISODATA and MAXLIKE.
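The intra-cluster variance comparison can be sketched as follows. This is a minimal illustration with hypothetical names; it assumes the population variance is computed separately for albedo and thermal inertia within each class, which may differ from the exact statistic reported in the tables.

```python
from collections import defaultdict
from statistics import pvariance


def intra_class_variance(pixels, labels):
    """Return {label: (albedo variance, thermal inertia variance)}.

    pixels : iterable of (albedo, thermal_inertia) pairs
    labels : class label assigned to each pixel by a classifier
    Population variance is an assumption of this sketch.
    """
    groups = defaultdict(list)
    for (albedo, inertia), label in zip(pixels, labels):
        groups[label].append((albedo, inertia))
    return {
        label: (
            pvariance([a for a, _ in members]),
            pvariance([t for _, t in members]),
        )
        for label, members in groups.items()
    }


# Tiny synthetic example: class 0 is spread out, class 1 is uniform.
pixels = [(0.2, 100.0), (0.4, 300.0), (0.1, 50.0), (0.1, 50.0)]
labels = [0, 0, 1, 1]
variances_by_class = intra_class_variance(pixels, labels)
```

Running the same function on the labels from two classifiers gives a like-for-like comparison only where the classes cover similar regions of the dataspace, which is the caveat noted above for the GMM classes.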
5. Discussion
The classification maps in
Figures 12 and
13 show a strong coherent spatial pattern of concentric class occurrence in both hemispheres, moving from the equator to higher latitudes. No spatial information was involved in the classification. This concentric class sequence corresponds to a general decrease in albedo moving outwards from the equator through the classes, accompanied by a general increase in thermal inertia. The trend is broadly due to decreasing surface dust coverage [
15] and fine grained sand as well as an increasing exposure of coarse grains, rocks [
14] and duricrust with distance from the equator [
21]. The classification of thermal inertia and albedo into fine dust-sized grains being dominant in the low latitudes, coarse sand in the mid-latitudes, and ice at the high latitudes, is consistent with previous thermophysical maps [
7,
8]. Generally, the global spatial patterns in surface materials are robust to the choice of classifiers applied in this work and consistent with previous works. From the pixel distance maps in
Figures 9 and
10, both the EM of a GMM and ISODATA + MAXLIKE classification have lower classification confidence in the polar regions. This is likely due to the spatial incoherence in the thermal inertia dataset, derived from the large variations in thermal inertia between the seasonal maps in this area [
10]. The moderate classification confidence at low latitudes appears to correspond to the placement of the orange-yellow-fuchsia class boundaries in
Figure 11. In general, the classification confidence in
Figure 9 shows an inverse relationship with thermal inertia—with lowest classification confidence occurring in regions of high thermal inertia (
Figure A1). The high classification confidence regions of
Figure 10 correspond to areas of low thermal inertia and high albedo (
Figures A1 and
A2).
It is difficult to directly compare this work with previous thermophysical classifications due to the differences in the thermal inertia and albedo datasets (
Figure 2).
Figure 14 compares the manual classification of [
8] to the output of the automated classification algorithms used in this work, applied to the same older datasets. In the manual classification, the three major modal peaks are each encompassed within a class (blue, yellow and red); however, the divisions applied to the rest of the data do not appear to be aligned with the underlying data structure. Furthermore, several of the classes encompass a broad range of thermal inertia and albedo values, which corresponds to a broad range of surface materials. For example, the fuchsia class encompasses materials with thermal inertia values ranging from 50 to 400 tiu, corresponding to particle sizes ranging approximately from 5 μm to 2 cm [
16,
18] (
Table 1) and possibly therefore environments with varied erosional histories. From
Figure 14, the five-class classification produced by ISODATA and Maximum Likelihood on the older data set is the only one which does not place a boundary cutting across one of the local maxima. The seven-class automated classification, which uses the same number of partitions as [
8], places several of the class boundaries in significantly different locations from the manual classification, and subdivides two of the modal peaks (blue and green; yellow and red). These subdivisions appear artificial from the viewpoint of the global data structure; however, some subdivisions may be useful for geological mapping, as discussed below.
The most recent mapping of thermophysical units using the same thermal inertia dataset as in this work was done by [
9,
57]. The most significant difference between the class boundaries in that work and previous mapping by [
8] occurs at the boundary between units F and G, which is placed by [
9] around albedo ~0.24, and thermal inertia > 403 tiu. The analogous classes in the mapping of this study are classes 5 (purple) and 7 (red) (
Table 6). The boundary between these classes occurs at a similar thermal inertia of >400 tiu, but at a higher albedo of ~0.3 (
Figure 11), and cuts across the 5.5 × 10⁴ count contour in
Figure 11. The boundary between the class F-G boundary in [
9] is more sensitive to the drop in counts observed at low albedo and high thermal inertia.
The sensitivity of the ISODATA and MAXLIKE algorithms clearly varies both with the dataset, and with the number of classes partitioned. An important result of this work is the identification of a number of factors which affect the sensitivity of these unsupervised clustering algorithms. It is essential to have an understanding of these factors prior to the application of these algorithms [
154]. The assignment of pixels by the combined ISODATA and MAXLIKE algorithms is shown in
Figure 6 for 5–14 classes, to compare the general behaviour of the algorithms for varying N. As the number of classes increases, more classes are generally assigned between albedo < 0.3 and thermal inertia < 700 tiu, and the boundaries of the classes are refined to better align with the underlying contours. These values encompass ~93% of the map, and correspond to surfaces dominated by dust, fine-coarse sand, indurated sand and duricrust, pebbles, and mixtures of these components (
Table 1). For each value of N, at most 1/3 of the total classes are assigned to encompass values outside the aforementioned albedo and thermal inertia range, consistent with the low density of data points in that region. For some values, the resulting class boundaries appear to cross contours and split the modal peaks, thereby creating an artificial segregation of the dataspace (for example, N = 12). This can also be observed when comparing the behaviour of the algorithms on slightly different datasets. For example, in
Figure 14 (older thermal inertia dataset) the low albedo/medium thermal inertia peak is clearly isolated for 5 and 6 classes, but subdivided for 4 and 7 classes. The subdivision occurs on the basis of thermal inertia, with the class boundary cross-cutting a range of albedo values. In
Figure 11 (newer thermal inertia dataset), the similar low albedo/medium thermal inertia peak is again subdivided into two classes but on the basis of albedo, with the boundary cross-cutting a range of thermal inertia values. Although classification validity should be primarily determined by groundtruthing the map and comparing to independent datasets, the sensitivity of the algorithms to the dataspace clearly varies. These results illustrate the importance of carefully examining the partitioning of the dataspace by algorithmic classifiers, as their sensitivity is affected both by the structure of the underlying dataspace, and the data range of the variables in the multivariate classification problem (
Figure 8).
The albedo and thermal inertia data structure shown in
Figure 1 indicates that four classes can encompass the major peaks within the data, with 2–3 further classes being useful for encompassing the less frequent values in the data, e.g., (i) high thermal inertia with high albedo; (ii) mid-high thermal inertia with low-mid albedo. This suggests that 6–7 classes are sufficient to capture the major structure within the two-dimensional dataspace. However, cluster validity also depends on application [
154,
155], and a higher number of divisions can sometimes be justified if they provide scientifically useful information. The application of thermal inertia and albedo data in this work is to remotely map Martian surface materials and surficial geology, and for this purpose a higher number of divisions can enable more information to be extracted from the dataspace. From
Figure 7, class 6 (orange) includes two peaks at high albedo (0.24 and 0.27) which correspond to a single peak in low thermal inertia (~55 tiu). From known characteristics of Martian surface materials, these pixels are likely surfaces dominated by a mantle of fine-grained dust < 10 μm across [
16] which is dominating the apparent thermal inertia (
Table 1). Given that complete and optically thick dust coverage results in an albedo of >0.27 [
13], the two peaks in albedo within this class may indicate sub-pixel dust free regions which would be of geologic interest for spectral studies. Hence in this context, subdividing this class to produce a larger number of partitions in the dataspace would provide a more useful interpretation of surface materials.
Figure 15 illustrates that some subdivisions of classes on the basis of either thermal inertia or albedo can be scientifically useful, for example when the thermophysical classes are being applied to discriminate surfaces mantled by Martian fines and to map pixels with a similar sub-pixel coverage of bright Martian dust or dark sand (
Table 1). One region of difference in the spatial map of the thermophysical classes produced by [
8] and that produced in this study is within Utopia Planitia. The thermal inertia values for this region (and albedo to a lesser extent) were typically higher in the newer data sets than in the data used by [
8]. From
Figure 15, the manual classification identifies much of the expanse of Utopia Planitia as being dominated by one major surface material (class B; yellow). In the Maximum Likelihood classification using the newer versions of thermal inertia and albedo, class 2 (green) shares a similar outer boundary to class B; however, the central region of Utopia is occupied by another surface material class: class 1 (blue). The boundary between class 1 and class 2 is somewhat correlated with the geological contact mapped by [
156], separating the Vastitas Borealis Formation (VBF) “mottled” (interior) and “knobby” (outer) regions. Many surface morphologies in this region are indicative of periglacial modification of the surface and a loss of past volatiles (e.g., [
157,
158]). The spatial correlation between class 1 and the “mottled” VBF unit highlights the usefulness of the subdivision between the green and blue classes in
Figure 11.
5.1. Assessment and Validation by Comparison to Surface Features
Table 6 provides an interpretation of the thermophysical units defined in this study, by comparing the data values within each class with the known properties of Martian materials summarised in
Table 1. To provide some groundtruthing of the map and the delineation of class boundaries, the interpretations are compared to surface features and geologic units.
At least four of the seven thermophysical classes defined here (classes 1–4) were sampled by landers and rovers on the Martian surface (
Figure 16) and hence can be groundtruthed. The results are generally consistent with the interpretation of the classes given in
Table 6, with the possible exception of class 3 (fuchsia).
Class 1 (blue) terrain was sampled at Meridiani Planum by the Opportunity Rover, where the surface was found to be predominantly dust free with an albedo of 0.12 [
48,
159]. The terrain at Meridiani is dominated by basaltic sand and grey spherical hematite grains, millimetres in diameter [
14]. Sand organised into dunes was also observed by Opportunity at Endeavour crater [
160]. The high thermal inertia materials observed in the landscape were sparse rocks (400–1100 tiu) and duricrust [
9,
161], consistent with the interpretation in
Table 6.
From
Figure 11 and
Table 6, class 2 (green) appears to be similar to class 1, but with a higher coverage of bright dust (higher albedo) and an overall smaller fraction of fines (low thermal inertia materials). Class 2 terrain was sampled by Pathfinder at Ares Vallis, where the surface was found to be dominated by fine-grained drift material and sand [
162], with ~16% of the observed area containing semi-rounded pebbles and larger rocks [
163]. Dark rocks were found to have discontinuous coatings of bright red dust, raising their albedo [
164]. The interpretation of grain sizes within class 2 terrain is in agreement with the fine component observed on the surface [
165], as this dominates the orbital thermal inertia [
10]. The Pathfinder site had the highest rock abundance of all of the landing sites [
1]; however, there are no pixels within class 5 that have an orbital thermal inertia consistent with pebbles or larger rocks, due to the extensive sub-pixel coverage of the fine component.
Class 3 (fuchsia) terrain was sampled by the Phoenix lander in eastern Arcadia, where ice-rich soil was obscured beneath drift and dust deposits [
28]. The interpretation of class 3 materials in
Table 6 is consistent with the observed fine component.
Class 4 (yellow) terrain was sampled at Gusev crater by the Spirit Rover, with the surface found to be dominated by a <1 mm thick bright dust covering [
48] over pebble-rich terrain and drift deposits (particles < 100 μm) [
92,
166]. Similar surface materials were observed by the Viking 1 lander in Chryse Planitia [
20], consistent with
Figure 16. Class 4 terrain was also sampled by Viking 2 at Utopia Planitia, with the surface found to be dominated by smooth fractured crusts (fragments 0.2–1.25 cm) with a fine component of crusty to cloddy material between the cracks, some rocks (centimetres to metres across), and little drift (<10 μm) material [
20]. Both Viking 2 and Spirit observed a strong presence of duricrust (200–300 μm cemented grains) [
2,
92,
166], consistent with the interpretation in
Table 6.
In summary, the differences between the algorithmically defined classes in orbital thermal inertia and albedo data have translated into observed differences on the Martian surface in the relative fractions of different end-member materials.
Martian sand dunes predominately larger than 1 km² are being mapped from THEMIS, MOC and CTX imagery [
167]. The dune boundaries can therefore be intersected with the thermophysical map to measure the overlap with different classes and test the interpretation of sand-dominated surfaces. From the ~10⁶ km² area of mapped dune coverage [
168], ~86% of total dune area (normalized by class surface area) was found to occur in classes 1 (blue) and 5 (purple), shown in
Figure 17. This indicates a strong correlation between large dune occurrence and surfaces interpreted as being dominated by coarse dark sand > 100 μm in
Table 6. Although classes 2–4 and 6 incorporate fine sand, they have little mapped dune area, which is likely related to the required grain size for saltation driving the formation of dunes [
40]. An example of a dune field in class 1 terrain is the Olympia Undae dune field [
169] shown in
Figure 18.
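The normalisation of dune area by class surface area can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the areas are hypothetical, and "normalized by class surface area" is interpreted here as a dune density per class (dune area divided by class area) that is then rescaled so that the shares sum to one. The procedure actually used may differ.

```python
def normalised_dune_share(dune_area, class_area):
    """Share of dune coverage per class after normalising by class area.

    dune_area[c]  : mapped dune area falling inside class c (km^2)
    class_area[c] : total surface area of class c (km^2)
    Returns a dict of fractions that sum to 1.
    """
    # Dune density: how much of each class is covered by mapped dunes.
    density = {c: dune_area.get(c, 0.0) / class_area[c] for c in class_area}
    total = sum(density.values())
    # Rescale so that classes of very different extent can be compared.
    return {c: d / total for c, d in density.items()}

# Hypothetical areas (km^2), chosen only to make the arithmetic easy to follow.
class_area = {1: 2.0e7, 2: 4.0e7, 5: 1.0e7}
dune_area = {1: 4.0e5, 2: 2.0e5, 5: 4.0e5}

share = normalised_dune_share(dune_area, class_area)
```

In this toy case classes 1 and 5 contain the same absolute dune area, but class 5 receives the larger normalised share because it covers half the surface area of class 1.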
A number of impact craters with diameter over 50 km are distinguished in the thermophysical map of
Figure 13. These craters can be identified by concentric circular structures of thermophysical units that contrast with the units dominating the surrounding terrain. This is consistent with observations of distinct high thermal inertia rims and impact ejecta surrounding many Martian craters [
55,
170]. Three interesting impact craters are shown in
Figure 19. The interior of Korolev crater shows ice-related morphologies on the interior mound [
171] and spectra consistent with a water ice composition [
172]. From
Figure 19, classes 5 and 7 infill Korolev crater and are correlated with the observed exposures of ice [
173], consistent with their interpretation in
Table 6.
McLaughlin crater shows evidence of a past lacustrine environment, with channels, possible debris aprons, and spectral evidence for clays and carbonates on the crater floor [
175]. These features occur in the region of classes 1 and 2 terrain within the crater in
Figure 19. The distribution of these materials within McLaughlin crater suggests there may be a relationship between the possibly once volatile-rich materials observed in this region and the classes 1 and 2 terrain. Additionally, other expanses of these terrains in the northern hemisphere (in Utopia and Acidalia Planitia) are correlated with extensive glacial and periglacial morphologies (e.g., [
157,
176]), and are modelled to have had the highest deposition of volatiles [
177] during moderate obliquity (25°–35°) within the last < 10 Ma [
178]. Class 4 may also be associated with subsurface volatiles, as it dominates the region visited by the Phoenix Lander in the northern arctic (
Figure 16). Similarly, the interior of Lomonosov crater shows a concentric distribution of classes 1, 2, and 4 associated with its central peak and crater floor. Ice-cemented soil in Lomonosov’s interior has been speculated from thermal observations [
88], and observations of seasonal water frost in the interior [
179]. The distribution of class 5 material on the northern wall of Lomonosov may be associated with the observation of pure coarse CO₂ frost in this region [
180], consistent with the interpretation given in
Table 6 and the occurrence of class 5 in the northern polar regions.
The delineation of major geologic structures such as Valles Marineris, Olympus Mons, and a number of large impact craters in the thermophysical map suggests a broad global correlation between the classes and Martian surficial geology. The Valles Marineris canyon system is shown in
Figure 20. The major canyons are outlined in the geologic map by a single geologic unit (purple) and predominantly in-filled by two distinct geologic units (pale yellow and blue). In the thermophysical map, the major canyons and the western labyrinth of valleys are clearly defined by a boundary of predominately class 5 (purple), and are in-filled primarily by class 1 (blue) and class 2 (green). Several of the geologic boundaries, for example, the boundary between the low viscosity lava flows of the “ridged plains unit” and the volcanic flows of the “syria planum formation” [
156,
181], are also echoed in the thermophysical map. This suggests that the map may be used to resolve different types of lava flows. Furthermore, the boundaries between units in this region of the thermophysical map are not clearly identified in either the albedo map or thermal inertia map alone. Hence the division of thermophysical classes in this region provides more information than either dataset on its own, and is broadly correlated with boundaries of geologic units.
Martian terrain is categorized into three broad periods of geologic history based on impact crater densities, reflecting the age of the surface since its last significant reworking. The broad age bands of Noachian (surface ages 4.1–3.7 Ga), Hesperian (3.7–3.0 Ga) and Amazonian (3.0 Ga–present) [
182] are each characterised by different surface processes and hence a weak relationship between surface age and grain size may be expected. Noachian surfaces, being the oldest, are heavily cratered and degraded. During this period the surface experienced extensive liquid water erosion through major flooding, such as the events that carved Valles Marineris [
183] and other valley networks [
184,
185], and likely had long-term standing water to produce the observed sedimentary layers (e.g., [
186,
187]) and clay minerals [
188,
189]. Hesperian surfaces also experienced water activity, with outbursts of water erosion forming the outflow channels [
190] and acidic water-rock interaction leading to the sulphate mineralogy [
188]. Volcanic activity was frequent during this period, with extensive lava plains covering the surface [
191]. The Amazonian epoch is characterized by significantly less water and lava erosion [
192], with predominantly water poor environments but extensive glacial/periglacial activity [
193]. The division of terrain from each of the three geologic epochs into the seven classes is shown in
Figure 21. Although each class is composed of terrain of all surface ages, there are some clear relationships between surface age and thermophysical class when corrected for surface area. For example, Amazonian terrain predominately occurs in classes 5 (purple) and 7 (red), consistent with the interpretation of surface ice in these classes obscuring the cratering record (
Table 6). Class 1 (blue) and 2 (green) terrain are dominated by fines (
Table 6) and are predominantly Noachian aged, consistent with the erosive action of liquid water and impact gardening increasing the fraction of fines and drifts on Noachian surfaces. No classes have a particular preference for Hesperian aged terrains.
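A cross-tabulation of surface age against thermophysical class, with an area correction, could be sketched as below. The sketch assumes an equal-angle (equirectangular) global grid, on which the true area of a pixel scales with the cosine of its latitude; the function name and the pixel values are hypothetical, and the area correction applied in this study may have been carried out differently.

```python
import math
from collections import defaultdict

def area_weighted_crosstab(pixels):
    """Fraction of each epoch's area falling in each class.

    pixels: iterable of (latitude_deg, epoch, class_id) for equal-angle
    grid cells, whose true area scales with cos(latitude).
    """
    area = defaultdict(float)
    epoch_total = defaultdict(float)
    for lat, epoch, cls in pixels:
        weight = math.cos(math.radians(lat))  # relative pixel area
        area[(epoch, cls)] += weight
        epoch_total[epoch] += weight
    return {key: a / epoch_total[key[0]] for key, a in area.items()}

# Hypothetical pixels, for illustration only.
pixels = [
    (0.0, "Noachian", 1),
    (0.0, "Noachian", 2),
    (60.0, "Amazonian", 5),
    (60.0, "Amazonian", 5),
    (0.0, "Amazonian", 2),
]
table = area_weighted_crosstab(pixels)
```

Counting pixels alone would assign two thirds of the Amazonian terrain in this toy grid to class 5; with the latitude weighting the two high-latitude pixels together carry the same area as the single equatorial pixel, so the fraction is one half.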
5.2. Future Work
The above comparison of the thermophysical classes derived by the combination of ISODATA with MAXLIKE with independent datasets on Martian surface morphologies and geology indicates that the divisions between classes translate into meaningful information on Martian surface materials. These results suggest that the unsupervised classification approach presented here can provide a powerful alternative to manual classification procedures, with new insights into Martian surficial geology.
Future work will potentially incorporate additional datasets into the classification (such as mineral maps, dayside thermal inertia, and elevation), and examine the datasets to determine the optimal number of classes for mapping thermally (and potentially mineralogically) distinct surface materials. It was noted in this work that outliers significantly affect the performance of the GMM algorithm. Hence although the algorithm appeared to have some significant limitations, it is possible that the performance could be improved by further restricting the dataspace to only include values that have a high frequency. This comes, however, at the cost of losing information on certain Martian surface materials. For example, high thermal inertia values > 1000 tiu comprise only a small fraction of the dataset, but indicate surfaces with significant pebble to boulder coverage (rocks larger than ~5 mm [
14,
23,
24,
26]). In addition, the GMM has difficulty in reproducing pixel counts, typically overestimating by a factor of >2. Given the analysis within this work, the combined use of ISODATA and MAXLIKE is recommended for any future work on unsupervised partitioning of these datasets.
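The two-stage idea of clustering first and then assigning pixels by maximum likelihood can be sketched in a few lines. This is a simplified illustration, not the implementation used in this work: plain k-means stands in for ISODATA (which additionally splits and merges clusters), a diagonal covariance is assumed for each class, and all function names and data values are hypothetical.

```python
import math

def kmeans(points, centres, iterations=20):
    """Plain k-means; a simplified stand-in for ISODATA (no split/merge)."""
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)),
                          key=lambda k: math.dist(p, centres[k]))
            groups[nearest].append(p)
        # Move each centre to the mean of its group (keep it if empty).
        centres = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return groups

def gaussian_stats(group, floor=1e-6):
    """Per-variable mean and variance (diagonal covariance assumed)."""
    n = len(group)
    means = [sum(col) / n for col in zip(*group)]
    variances = [
        max(sum((v - m) ** 2 for v in col) / n, floor)
        for col, m in zip(zip(*group), means)
    ]
    return means, variances

def log_likelihood(p, stats):
    """Log of the Gaussian density of point p under one class."""
    means, variances = stats
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        for v, m, var in zip(p, means, variances)
    )

def classify(p, class_stats):
    """Maximum likelihood label: the class with the highest density at p."""
    return max(range(len(class_stats)),
               key=lambda k: log_likelihood(p, class_stats[k]))

# Hypothetical rescaled (albedo, thermal inertia) pixels, illustration only.
points = [(0.10, 0.20), (0.12, 0.22), (0.11, 0.18),
          (0.80, 0.10), (0.82, 0.12), (0.78, 0.08)]

groups = kmeans(points, centres=[points[0], points[3]])
class_stats = [gaussian_stats(g) for g in groups]
labels = [classify(p, class_stats) for p in [(0.15, 0.25), (0.75, 0.05)]]
```

The clustering stage supplies the class statistics, and the likelihood stage then assigns each pixel using both the mean and the spread of every class rather than distance to the centre alone.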