Ocean-Surface Heterogeneity Mapping (OHMA) to Identify Regions of Change

: Mapping heterogeneity of the ocean’s surface waters is important for understanding biogeographical distributions, ocean surface habitat mapping, and ocean surface stability. This article describes the Ocean-surface Heterogeneity MApping (OHMA) algorithm—an objective, replicable approach that uses hypertemporal, satellite-derived datasets to map the spatio-temporal heterogeneity of ocean surface waters. The OHMA produces a suite of complementary datasets—a surface spatio-temporal heterogeneity dataset, and an optimised spatio-temporal classiﬁcation of the ocean surface. It was demonstrated here using a hypertemporal Sea Surface Temperature image dataset of the North Atlantic. Validation with Underway-derived temperature data showed higher heterogeneity areas were associated with stronger surface temperature gradients, or an increased presence of locally extreme temperature values. Using four exploratory case studies, spatio-temporal heterogeneity values were related to a range of region-speciﬁc surface and sub-surface characteristics including fronts, currents and bathymetry. The values conveyed the interactions between these parameters as a single metric. Such over-arching heterogeneity information is virtually impossible to map from in-situ instruments, or less temporally dense satellite datasets. This study demonstrated the OHMA approach is a useful and robust tool to explore, examine, and describe the ocean’s surface. It advances our capability to map biologically relevant measures of ocean surface heterogeneity. It can support ongoing efforts in Ocean Surface Partitioning, and attempts to understand marine species distributions. The study highlighted the need to establish dedicated spatio-temporal ocean validation sites, speciﬁcally measured using surface transits, to support advances in hypertemporal ocean data use, and exploitation. A number of future research avenues are also highlighted.


Heterogeneity, and the Ocean-Surface
The world is heterogeneous and non-equilibrial [1]. This heterogeneity is underpinned by complexity and/or variability, as properties change over space and/or time [2]. The ocean structure, for example, is shaped by changes in water properties (e.g., temperature, salinity, nutrient content) and water movement in space and time. When water masses interact, they create a 4-dimensional architecture which is more structurally complex, and spatio-temporally variable. These confluence zones, or discontinuities, offer a wide variety

•
Being univariate in nature (e.g., a dataset of only SST, or only Chl-a measurements); • Containing a set of time slice images which are precisely co-registered (i.e., a pixel in one time-slice image, has an equivalent co-located pixel in every other time-slice); • Exhibiting radiometric consistency between images (i.e., they are measured using the same sensors or inter-validated sensor systems, and exhibit a degree of normalisation between time-slices), and; • Being comprised of "frequent, equal spaced observations" [27] which are discrete over time (for example daily image data acquired at weekly intervals, or composite image data composed of daily image data).
Today, there are a growing number of hypertemporal archives available for oceanresearchers to exploit. Scarrott et al. noted that much of the development in hypertemporal approaches had occurred for land applications [26], with a range of methodological approaches now available for ocean research to exploit this data resource. These include approaches to mapping heterogeneity using hypertemporal datasets (e.g., [11,28]). However, they noted that there were challenges to overcome in using those approaches on ocean-surface datasets [26]. For example, a divergence-guided Iterative Self-Organising Data Analysis (ISODATA) clustering approach [29,30] has been used in the Landscape Heterogeneity Mapping (LaHMa) algorithm, to quantify landscape heterogeneity [28]. This approach uses expert knowledge of the user, to determine the optimal number of clusters within the data, which sets the upper bounds of a cluster ensemble. However, this subjective step is based on experience, or access to pre-existing landscape information. For ocean applications, the approach needs to be enhanced to make it data-driven.
This study develops the Ocean-surface Heterogeneity MApping (OHMA) approach for ocean surface waters, through applying an adapted LaHMa algorithm to a hypertemporal dataset of Sea Surface Temperature (SST) data. The validity of the methodology was examined using Underway-derived in-situ estimates of spatio-temporal heterogeneity in temperature. This validation, coupled with a preliminary examination of the North Atlantic using STH and an initial set of other ocean parameters, was then used to provide guidance on how a STH dataset should be interpreted.

Sea Surface Temperature (SST) Dataset
Satellite-derived Sea Surface Temperature (SST) data are one of the most t extensive and consistent satellite-derived datasets available, with SST data av over 30 years. Using a combination of infrared and microwave sensor inputs, sor-derived level 4 SST products can provide inter-comparable SST values at d lutions. The data selected to develop and demonstrate the OHMA methodolog tracted from the GHRSST Level 4 MUR Global Foundation SST Analysis produc 2.0, (available from the Physical Oceanography Distributed Active Archive Cen ated by the NASA Jet Propulsion Laboratory). It is globally gridded at ~0.1° res merging data from the MODIS, AMSR-E, and AVHRR sensors [34] to limit the cloud coverage. It includes information for quality assessment and pixel scree SST Image data were extracted at 7-day intervals from 1 January to 31 Decem and subset to the NAA area of interest. The final dataset qualifies as hypertempo it is univariate (SST), is composed of frequent, equal spaced observations whic crete over time, and contains pixels that are precisely co-registered. Furthermo taset exhibits the appropriate degree of normalisation between time-slices. All S in the data product are derived from multiple-sensors which have been inter-

OHMA Processing 2.2.1. Sea Surface Temperature (SST) Dataset
Satellite-derived Sea Surface Temperature (SST) data are one of the most temporally extensive and consistent satellite-derived datasets available, with SST data available for over 30 years. Using a combination of infrared and microwave sensor inputs, multi-sensorderived level 4 SST products can provide inter-comparable SST values at daily resolutions. The data selected to develop and demonstrate the OHMA methodology were extracted from the GHRSST Level 4 MUR Global Foundation SST Analysis product, Version 2.0, (available from the Physical Oceanography Distributed Active Archive Center, operated by the NASA Jet Propulsion Laboratory). It is globally gridded at~0.1 • resolution by merging data from the MODIS, AMSR-E, and AVHRR sensors [34] to limit the impact of cloud coverage. It includes information for quality assessment and pixel screening. The SST Image data were extracted at 7-day intervals from 1 January to 31 December 2011, and subset to the NAA area of interest. The final dataset qualifies as hypertemporal in that it is univariate (SST), is composed of frequent, equal spaced observations which are discrete over time, and contains pixels that are precisely co-registered. Furthermore, the dataset exhibits the appropriate degree of normalisation between time-slices. All SST values in the data product are derived from multiple-sensors which have been inter-validated, with all raw values passing through an optimal interpolation using wavelets to produce the final distributed values.

Figure 2.
Deriving a surface spatio-temporal heterogeneity (STH) dataset, and ancillary products (in green) from level-4 SST imagery. Orange arrows indicate the pre-processing elements of the chain. Red arrows indicate the OHMA processing chains through to the outputs (highlighted as green rounded boxes).

OHMA Implementation
The Ocean-surface Heterogeneity MApping (OHMA) process is detailed in Figure 3. It is automated once the input data are selected. Outputs include (i) a single STH dataset, (ii) an optimal cluster image dataset which best represents the variability in the data, and (iii) the temporal signatures associated with that cluster image. Steps 1 and 4 follow the LaHMa algorithm [28] and amendments noted by Ali et al. [11]. Deriving a surface spatio-temporal heterogeneity (STH) dataset, and ancillary products (in green) from level-4 SST imagery. Orange arrows indicate the pre-processing elements of the chain. Red arrows indicate the OHMA processing chains through to the outputs (highlighted as green rounded boxes).

OHMA Implementation
The Ocean-surface Heterogeneity MApping (OHMA) process is detailed in Figure 3. It is automated once the input data are selected. Outputs include (i) a single STH dataset, (ii) an optimal cluster image dataset which best represents the variability in the data, and (iii) the temporal signatures associated with that cluster image. Steps 1 and 4 follow the LaHMa algorithm [28] and amendments noted by Ali et al. [11].
The hypertemporal SST image dataset was processed through a series of unsupervised classification runs, using an Iterative Self-Organising Data Analysis (ISODATA) approach in Erdas IMAGINE 2018 software. These were executed with each run generating a predefined number (from 10 to 100) of output clusters. All ISODATA clustering runs were implemented with the number of iterations set to 50, the convergence threshold set to 1.0, and zero values classified. The ISODATA runs also produced a signature file to accompany each cluster dataset, containing statistical summaries of the cluster values arranged perimage over time (signatures). For each classification, divergence and Jeffries-Matusita separability measures [35][36][37] were calculated between all possible pairs of signatures, to assess their separability. The minimum and average divergence, and Jeffries-Matusita, indices were then extracted to represent each run. The hypertemporal SST image dataset was processed through a series of unsupervised classification runs, using an Iterative Self-Organising Data Analysis (ISODATA) approach in Erdas IMAGINE 2018 software. These were executed with each run generating a pre-defined number (from 10 to 100) of output clusters. All ISODATA clustering runs were implemented with the number of iterations set to 50, the convergence threshold set to 1.0, and zero values classified. The ISODATA runs also produced a signature file to accompany each cluster dataset, containing statistical summaries of the cluster values arranged per-image over time (signatures). For each classification, divergence and Jeffries-Matusita separability measures [35][36][37] were calculated between all possible pairs of signatures, to assess their separability. The minimum and average divergence, and Jeffries-Matusita, indices were then extracted to represent each run.
The key enhancement to the LaHMa approach is the use of the Jeffries-Matusita measure. In Erdas IMAGINE, Jeffries-Matusita distance (Jij) is calculated as the separability of a pair of probability distributions, using the formula [35]: where: where: i and j = two temporal signatures (classes) being compared, Ci = the covariance matrix of signature i, Cj = the covariance matrix of signature j, µi = the mean vector of signature i, µj = the mean vector of signature j, ln denotes the natural logarithm function, and T denotes the transpose of a matrix. The square root of Jeffries-Matusita, which is asymptotic towards √ 2, (Equation (1)), gives a plateau at ~1.414 [35]. This plateau was exploited to remove subjectivity from the separability analysis step. The key enhancement to the LaHMa approach is the use of the Jeffries-Matusita measure. In Erdas IMAGINE, Jeffries-Matusita distance (J ij ) is calculated as the separability of a pair of probability distributions, using the formula [35]: where: where: i and j = two temporal signatures (classes) being compared, C i = the covariance matrix of signature i, C j = the covariance matrix of signature j, µ i = the mean vector of signature i, µ j = the mean vector of signature j, ln denotes the natural logarithm function, and T denotes the transpose of a matrix.
The square root of Jeffries-Matusita, which is asymptotic towards √ 2, (Equation (1)), gives a plateau at~1.414 [35]. This plateau was exploited to remove subjectivity from the separability analysis step.
The separability data for all runs were then plotted against the number of classes generated by each run (Figure 4). The plot was interpreted objectively using pre-defined criteria, to select the optimal number of clusters to represent the variability in the processed hypertemporal SST image dataset. The pre-defined criteria were as follows:

•
The number of clusters selected was the lowest number which satisfied the separability criteria; • The cluster number selected featured a positive peak in both the minimum and average Divergence separability measure; • The cluster number selected features a Jeffries-Matusita value of 1.414 or above (denoting the highest level of separability between all clusters [38]).
• The cluster number selected features a Jeffries-Matusita value of 1.414 or above (denoting the highest level of separability between all clusters [38]).
This coupling of the divergence peak-seeking approach [28,29] to an analysis of Jeffries-Matusita characteristics (Steps 2 and 3 in Figure 3) is the key aspect for ocean science applications. Only when the minimum Jeffries-Matusita separability is at 1.414 or above, are coincident peaks in divergence acceptable as indicators of the upper bounds of the OHMA cluster boundary ensemble ( Figure 4). The process of optimising the number of clusters and producing the cluster ensemble is thus fully automated, without need for expert knowledge of the region, and meeting the ocean science criteria outlined in [26].

Figure 4.
A graphical illustration of the separability analysis which determined the optimal number of data clusters. The estimates of all four separability measures between clusters are plotted against the number of clusters produced by the ISODATA clustering. Note that (mean divergence)/100 is shown for visualisation purposes. Highlight (i) in orange shows the 22 clusters which would have been selected using de Bie et al.'s LaHMa approach [28], solely looking for coincident peaks in divergence measures, without consideration of the Jeffries-Matusita separability measure. Highlight (ii) in green shows the location of a peak which satisfies both the divergence peak criteria, and the Jeffries-Matusita criteria needed for the more objective OHMA approach.
For each cluster dataset image, from 10 up to the optimal number of clusters (in this case 53 clusters), the cluster boundaries were extracted using ESRI ArcMap Version 10.5. As per the LaHMa approach [11,28], each output raster was first converted to a vector polygon, maintaining the integrity of the pixel boundaries by de-selecting the option to "simplify polygons". These polygons were then converted into polylines, with each feature given a value of 1. The polyline file was converted to a 1/0 raster image with a pixel resolution 1/6th the spatial resolution of the original SST imagery. These 44 1/0 image datasets were then summed, to produce a single raster image with raster values from 0 to Figure 4. A graphical illustration of the separability analysis which determined the optimal number of data clusters. The estimates of all four separability measures between clusters are plotted against the number of clusters produced by the ISODATA clustering. Note that (mean divergence)/100 is shown for visualisation purposes. Highlight (i) in orange shows the 22 clusters which would have been selected using de Bie et al.'s LaHMa approach [28], solely looking for coincident peaks in divergence measures, without consideration of the Jeffries-Matusita separability measure. Highlight (ii) in green shows the location of a peak which satisfies both the divergence peak criteria, and the Jeffries-Matusita criteria needed for the more objective OHMA approach. This coupling of the divergence peak-seeking approach [28,29] to an analysis of Jeffries-Matusita characteristics (Steps 2 and 3 in Figure 3) is the key aspect for ocean science applications. Only when the minimum Jeffries-Matusita separability is at 1.414 or above, are coincident peaks in divergence acceptable as indicators of the upper bounds of the OHMA cluster boundary ensemble ( Figure 4). The process of optimising the number of clusters and producing the cluster ensemble is thus fully automated, without need for expert knowledge of the region, and meeting the ocean science criteria outlined in [26].
For each cluster dataset image, from 10 up to the optimal number of clusters (in this case 53 clusters), the cluster boundaries were extracted using ESRI ArcMap Version 10.5. As per the LaHMa approach [11,28], each output raster was first converted to a vector polygon, maintaining the integrity of the pixel boundaries by de-selecting the option to "simplify polygons". These polygons were then converted into polylines, with each feature given a value of 1. The polyline file was converted to a 1/0 raster image with a pixel resolution 1/6th the spatial resolution of the original SST imagery. These 44 1/0 image datasets were then summed, to produce a single raster image with raster values from 0 to 44. The raster pixels were then aggregated using a cell factor of 6, and selecting the maximum value technique. This was the approach used by Ali et al. [11] to produce the raster STH image dataset of the same spatial resolution and spatial completeness as the input data. The output of the process is effectively a per-pixel count of the number of cluster boundaries which lie across that pixel, over the course of an objectively defined range of classification runs. The pixel value in a STH dataset can be defined as follows: where: b = 1, pixel (p) contains a cluster boundary, b = 0, pixel (p) does not contain a cluster boundary, STH p = surface spatio-temporal heterogeneity in a pixel (p), k u = optimal number of clusters to represent data variability, and k l = lowest number of clusters to represent data variability.
Within each cluster image, these boundaries separate groups of clustered pixels, grouped on the basis of their temporal profiles statistical characteristics. Pixels with higher STH values more dependably separate regions with different parameter (in this case temperature) profiles over the analysed time period. The single-image dataset contains spatial, temporal and difference information, with the pixel values giving estimates of spatio-temporal heterogeneity (STH). A fully detailed summary of the development of the OHMA algorithm adaptations is provided in [39]. . Temperature measurements were taken at 3 to 6 metres depth using a thermosalinograph (Seabird, SBE21 with temperature sensor SBE38) on board the RV Celtic Explorer [41,42], and RV Celtic Voyager [43,44]. Each cruise is treated as distinct, as the sensors may have been swapped out intermittently to be calibrated. However, the heterogeneity of absolute values within individual cruises can be considered to be reliable for validating the STH values. For the purpose of this study, a transit was considered to be the sequence of temperature measurements acquired by the vessel's thermosalinograph, while the vessel travels through the area of an STH image pixel.

Spatial and Temporal Bias Reduction to Derive Comparable Heterogeneity Measures
To overcome the spatial and temporal sampling bias in the in-situ transit data, a three-stage bias-reduction approach was taken. Firstly, a grid with STH pixel extents was created. Only pixel areas which featured at least one measurement in all four quarters of 2011 (Q1 = 1 January to 31 March; Q2 = 1 April to 2 July; Q3 = 3 July to 30 September; and Q4 = 1 October to 31 December) were selected as potentially acceptable validation sites. Pixels adjacent to the coastline and in close proximity to each other were discarded, to reduce (i) the influence of mixed pixels and (ii) any potential for spatial autocorrelation. Of the 3677 Underway-sampled pixels, 14 pixels had acceptable data coverage of each quarter of 2011 ( Figure 5). These validation pixels all lie in the CSR around the coastline of Ireland where the Marine Institute vessels typically operate, and had STH values that ranged from 0 to 29 (excluding coastal or poor SST data artifact pixels). This compares to a range of 0 to 39 for the full STH dataset (the NAA region). Each validation pixel's longitude and latitude is 0.102 • , giving a sampling area between approximately 81.4 km 2 to 74.7 km 2 , depending on latitude.
Remote Sens. 2021, 13, x FOR PEER REVIEW 9 of 35 a range of 0 to 39 for the full STH dataset (the NAA region). Each validation pixel's longitude and latitude is 0.102°, giving a sampling area between approximately 81.4 km 2 to 74.7 km 2 , depending on latitude. Sampling frequency, coupled with the high variability in vessel speed and area covered, increases the potential for a spatial sampling bias. In an effort to reduce this, stage two sub-divided each of the 14 selected pixels (spatial resolution ~0.102°) into a 25 × 25 grid of sub-pixels (spatial resolution 0.004°). The mean temperature of each sub-pixel was obtained, with the full pixel heterogeneity measures calculated from the sub-pixel values. In four cases, a vessel transit gathered less than five measurements across a pixel. In these cases, to maximise the number of measurements available, ten measurements before and after the research vessel crossed the adjacent pixel boundary were used. These were subpixellated, and averaged with the central measurements to calculate transit heterogeneity. Measures of statistical dispersion (range, h-spread, standard deviation, and variance) and of statistical shape (kurtosis, absolute kurtosis, skewness, and absolute skewness) were calculated from each pixel's set of sub-pixels, giving a measure of its in-situ heterogeneity for the vessel transit. The cumulative difference in sequential measurements along a pixel transit, and the mean difference in sequential measurements over a pixel transit, were also calculated, as additional measures of statistical dispersion.
Bias introduced by excessive sampling during a single annual quarter, was addressed by stage three. The mean heterogeneity values of the cruises within each annual quarter were calculated. The annual mean was then derived from these four values. The final annual value is thus more spatially and temporally comparable to that pixel's OHMA-derived STH value. A Spearman's rank correlation analysis was then used to compare each in-situ-derived heterogeneity measure to its STH counterpart. This was seen as adequate Sampling frequency, coupled with the high variability in vessel speed and area covered, increases the potential for a spatial sampling bias. In an effort to reduce this, stage two sub-divided each of the 14 selected pixels (spatial resolution~0.102 • ) into a 25 × 25 grid of sub-pixels (spatial resolution 0.004 • ). The mean temperature of each sub-pixel was obtained, with the full pixel heterogeneity measures calculated from the sub-pixel values. In four cases, a vessel transit gathered less than five measurements across a pixel. In these cases, to maximise the number of measurements available, ten measurements before and after the research vessel crossed the adjacent pixel boundary were used. These were subpixellated, and averaged with the central measurements to calculate transit heterogeneity. Measures of statistical dispersion (range, h-spread, standard deviation, and variance) and of statistical shape (kurtosis, absolute kurtosis, skewness, and absolute skewness) were calculated from each pixel's set of sub-pixels, giving a measure of its in-situ heterogeneity for the vessel transit. The cumulative difference in sequential measurements along a pixel transit, and the mean difference in sequential measurements over a pixel transit, were also calculated, as additional measures of statistical dispersion.
Bias introduced by excessive sampling during a single annual quarter, was addressed by stage three. The mean heterogeneity values of the cruises within each annual quarter were calculated. The annual mean was then derived from these four values. The final annual value is thus more spatially and temporally comparable to that pixel's OHMAderived STH value. A Spearman's rank correlation analysis was then used to compare each in-situ-derived heterogeneity measure to its STH counterpart. This was seen as adequate to provide information on the STH datasets validity, noting sub-optimal sample size (14 samples). It also allowed the potential interpretation of an SST-derived STH value to be explored.

Characterising the STH Product Using Physical Phenomena
Using Generalised Linear Modelling, the 2011 STH dataset was compared to a number of modelled fronts, currents and bathymetric products generated using data from different satellite sensors, and in-situ measurements ( Table 1). The focus here was to determine if the STH features were related to physical ocean surface phenomena, and geophysical sub-surface features. Sea surface Temperature Front (SSTF) data were generated and provided by Plymouth Marine Laboratory as outlined in [45,49], for the NAA area. Discrete 7-day front images were generated for 2011 (1-7 January, 8-15 January, etc.) from weekly composited microwave and infrared SST data at 9 km spatial resolution. A threshold of a minimum SST step of 0.6 • C across a pixel, was used to flag and identify the presence of a thermal front in the SST imagery [50]. Note that a single-day front image for 31 December was incorporated into the dataset to match the OHMA data inputs (1 January-31 December 2011). Table 1 shows the range of statistical summaries that were derived from these SSTF images.
Modelled sea surface current data, produced under the Globcurrent initiative [46] were obtained freely from the Copernicus Marine Environment Monitoring Service (CMEMS) portal [51]. Discrete 1-day surface current images, corresponding to the OHMA input datasets (i.e., 1 January, 8 January, etc.) were subset to the NAA area. These were processed into directionless, geostrophic current speed images in ms −1 at weekly intervals, and at 0.25 • spatial resolution. Table 1 presents the list of summary statistics calculated. Bathymetry data for all four case study areas were obtained from the General Bathymetric Chart of the Oceans (GEBCO), developed through the Nippon Foundation-GEBCO Seabed 2030 project [47,48]. The GEBCO_2019 data product provides global bathymetric coverage on a 15 arc-second grid. Data values refer to elevations at the centre of each grid cell. The elevation models are generated by assimilating heterogeneous data types at mean sea level. The GEBCO grid created uses the 'base' Version 1 of the SRTM15+ dataset [52,53], augmented with datasets from a number of international and national data repositories and regional mapping initiatives. Three measures of the ocean floor (slope, aspect and absolute bathymetry) were derived from the GEBCO dataset (Table 1).

Characterisation Approach
The various statistical summary images of bathymetry, SSTF and SSSp image data were re-sampled to match the spatial resolution of the STH dataset (~0.102 • ), using a centroidregridding approach. Re-sampling parameter datasets to the lowest spatial resolution dataset (SSSp), could have introduced a higher level of information loss from the SSTF dataset. With this in mind, the STH resolution was selected as it lay between that of both the lowest resolution (SSSp), and highest resolution (SSTF) products, minimising any imbalance in information loss. An area-weighted number of random sampling points were generated for each case study area independently. While the geographic coordinate system WGS84 resulted in a slight sampling bias in favour of northern latitudes, it was considered acceptable for this study. Each point was populated with the co-located STH value and the comparative parameter values. For each case study, a Generalised Linear Model (GLM) [54,55] was developed with the STH values as the dependent variable. This was done using the GGally, lme4, MuMIn, car and MAS packages in an R programming framework. A GLM enables the regression analysis of a variety of non-normal responses such as binary indicators, counts and positively valued random variables [55]. The STH distributions were examined, and determined to be non-normally distributed. For each case study, a pairwise correlation analysis of all the independent parameter variables was applied, with a threshold of 0.45. This identified potentially inter-related factors in the case study area. If inter-relations were evident, one representative factor of the inter-related group was selected for input to the case study GLM. The GLM was developed by testing all possible permutations (Dredging) of the selected variables. The most straightforward model was selected from those with the lowest Akaike Information Criterion (lowest AIC +/− 8). An over-dispersion threshold of 1.5, above which corrective action is needed to address it [56], was set to determine the success of poisson-, quasi-poisson-and negative binomial models in explaining STH variability. For all case study areas, only negative binomial-based models were acceptable according to this threshold. Full details for each GLM are available in Appendix A. Each GLM was then interpreted qualitatively, using available literature on the area. Figure 6 shows a map of the NAA region, using the surface spatio-temporal heterogeneity (STH) dataset. This dataset was generated by applying the OHMA process to a 7-day temporal resolution SST dataset for 2011. Pixels which contain a transition between valid ocean data points and mask value (land or invalid data), contained the maximum value of 44. These extreme value artifacts are highlighted as features A-E in Figure 6, and indicate the OHMA process performed as expected. Homogeneous areas, with no evidence of cluster boundaries throughout the classification runs, had a STH value of 0, (e.g., see the area east of Newfoundland, Canada represented by F in Figure 6 and examined in Figure 7a). Areas with STH values between 1 and 43 represent low to high STH, respectively. Within the valid areas, there are two extremes of STH values. An example of an area with high STH values was observed off the eastern shores of Iceland (feature G in Figure 6). Figure 7b shows this area in more detail, with a marked difference in the temporal SST signal of data pixels to the north and south of the strong elongated STH feature. Such strong STH features are evident in a number of areas across the NAA region, such as area H ( Figure 6) in the vicinity of the Newfoundland Basin. Validation results, (described in Section 3.2), emphasized that higher STH pixels were associated with the presence of locally extreme values, or higher gradient transitions across adjacent areas, where there is a sudden spatial change in the surface water variable measurements. Low to medium level STH values dominate the STH map. For example, marine waters off Ireland, where in-situ data were used for validation purposes (area I in Figures 5 and 6).

OHMA Output: A Surface Spatio-Temporal Heterogeneity (STH) Dataset
as area H ( Figure 6) in the vicinity of the Newfoundland Basin. Validation results, (described in Section 3.2), emphasized that higher STH pixels were associated with the presence of locally extreme values, or higher gradient transitions across adjacent areas, where there is a sudden spatial change in the surface water variable measurements. Low to medium level STH values dominate the STH map. For example, marine waters off Ireland, where in-situ data were used for validation purposes (area I in Figures 5 and 6). Figure 6. The STH map produced from 2011 data for the region of interest (North Atlantic), with highlighted features. The figure shows the artifacts to expect from the OHMA process, namely the highest STH value (44) for A-coastlines, Bice/land-linked poor data pixels, C-mixed coastal data and poor data areas, D-islands, and E-isolated poor data areas. It also shows a sample of STH features which are valid pixel estimates, including F-the Northeast Newfoundland Shelf area exhibiting no detectable cluster boundaries (i.e., a spatio-temporally homogeneous region), G-the Iceland and Faroe shelf front region, H-highly heterogeneous waters in and around the Newfoundland Basin, and I-more muted and subtle heterogeneity features in Irish territorial waters. The figure shows the artifacts to expect from the OHMA process, namely the highest STH value (44) for A-coastlines, B-ice/land-linked poor data pixels, C-mixed coastal data and poor data areas, D-islands, and E-isolated poor data areas. It also shows a sample of STH features which are valid pixel estimates, including F-the Northeast Newfoundland Shelf area exhibiting no detectable cluster boundaries (i.e., a spatio-temporally homogeneous region), G-the Iceland and Faroe shelf front region, H-highly heterogeneous waters in and around the Newfoundland Basin, and I-more muted and subtle heterogeneity features in Irish territorial waters.
The OHMA process also produces (i) an image dataset of the optimal number of clusters that represent the variability in the hypertemporal datacube, and (ii) a signature file with the temporal statistical summaries of each cluster. Figure 8 highlights how cluster extent and temporal signature can be combined to produce map classes (e.g., Class 3), which can then be displayed with the STH information in a map format. As pixels are grouped by their temporal similarity, pixels in a cluster which are bounded by high STH values highlight regionally homogeneous areas, such as Class 3 in Figure 8. The OHMA process also produces (i) an image dataset of the optimal number of clusters that represent the variability in the hypertemporal datacube, and (ii) a signature file  The matching temporal profiles of the four clusters, depicted here using mean temperature, and standard deviation as a measure of variability across cluster pixels (error bars). Note that cluster colour is not an indication of temperature, with a pink arrow indicating the sequence of classes in both panels.

Insights from Validating the STH Product
Correlation analysis of the 14 validation sites (data pixels) off the Irish coast determined that higher STH pixels were associated with the presence of locally extreme values, or higher gradient transitions in transit measurements acquired by research vessel. Whilst the correlations do not imply causation, they do provide evidence of potential associations. Significant associations were identified in heterogeneity measures concerned with the shape of the probability distribution of the transit temperature measurements, namely between the STH and the absolute skewness (Spearman's Rank, n = 14, ρ [12] = +0.5755, p < 0.05), and kurtosis (Spearman's Rank, n = 14, ρ [12] = 0.5446, p < 0.05). These shape changes can be a consequence of (i) encountering locally extreme measurement values, and/or (ii) the increased presence of sharper transitions in the transit measurements (for further information, see Appendix B). Concerning the kurtosis results, there was evidence of sampling bias, with more highly sampled pixels associated with increasingly lighttailed, platykurtic distributions (Spearman's Rank, n = 14, ρ [12] = −0.4928, p < 0.1). There were no statistically significant associations evident between pixel STH, and in-situ (transit) heterogeneity measures classed as measures of dispersion (range, h-spread, variance, standard deviation, cumulative difference in sequential measurements, and mean cumulative difference in sequential measurements). Sampling size bias may have had an influence here, with short transits showing little dispersion in values.

Preliminary Regional Insights from Characterising the SST STH Product
The GLM descriptions for each case study area are presented in Table 2. The variability in STH was not solely explained by one factor (fronts or ocean currents or bathymetry) in any case study region. Each case study required a bespoke GLM composed of differing parameters. For the larger NAA region, factors associated with bathymetry, sea surface currents and sea surface temperature fronts explained approximately 27% of the variability evident in the STH dataset. Shallower waters were associated with higher STH The matching temporal profiles of the four clusters, depicted here using mean temperature, and standard deviation as a measure of variability across cluster pixels (error bars). Note that cluster colour is not an indication of temperature, with a pink arrow indicating the sequence of classes in both panels.

Insights from Validating the STH Product
Correlation analysis of the 14 validation sites (data pixels) off the Irish coast determined that higher STH pixels were associated with the presence of locally extreme values, or higher gradient transitions in transit measurements acquired by research vessel. Whilst the correlations do not imply causation, they do provide evidence of potential associations. Significant associations were identified in heterogeneity measures concerned with the shape of the probability distribution of the transit temperature measurements, namely between the STH and the absolute skewness (Spearman's Rank, n = 14, ρ [12] = +0.5755, p < 0.05), and kurtosis (Spearman's Rank, n = 14, ρ [12] = 0.5446, p < 0.05). These shape changes can be a consequence of (i) encountering locally extreme measurement values, and/or (ii) the increased presence of sharper transitions in the transit measurements (for further information, see Appendix B). Concerning the kurtosis results, there was evidence of sampling bias, with more highly sampled pixels associated with increasingly light-tailed, platykurtic distributions (Spearman's Rank, n = 14, ρ [12] = −0.4928, p < 0.1). There were no statistically significant associations evident between pixel STH, and in-situ (transit) heterogeneity measures classed as measures of dispersion (range, h-spread, variance, standard deviation, cumulative difference in sequential measurements, and mean cumulative difference in sequential measurements). Sampling size bias may have had an influence here, with short transits showing little dispersion in values.

Preliminary Regional Insights from Characterising the SST STH Product
The GLM descriptions for each case study area are presented in Table 2. The variability in STH was not solely explained by one factor (fronts or ocean currents or bathymetry) in any case study region. Each case study required a bespoke GLM composed of differing parameters. For the larger NAA region, factors associated with bathymetry, sea surface currents and sea surface temperature fronts explained approximately 27% of the variability evident in the STH dataset. Shallower waters were associated with higher STH values and a positive relationship was found between STH and mean sea-surface current speeds. Furthermore higher STH areas had an increased number of SST fronts, and a positive relationship with the strength of these fronts. All factors in the NAA GLM were highly significant (p-value < 0.0001; Table 2). For the NBA sub-region, approximately 32% of the STH variability could be explained by a combination of factors. The seabed gradient had a strong influence on STH values (p-value < 0.0001; Table 2). Locally higher STH areas were related to areas of higher gradient transitions in sea-surface current speeds (p-value < 0.0001; Table 2). This was also reflected in front counts (p-value < 0.0001; Table 2), and the maximum front strength to a lesser significance (p-value <0.01; Table 2). The number of detected SST fronts was also a highly significant predictor of STH values in GLMs for the CSR and IFF sub-regions. However, these two regions differed in the second influence underpinning the STH values. For the CSR, bathymetry characteristics were not good predictors of STH, and mean sea surface current speeds were negatively related to STH. In contrast, the IFF GLM showed that shallower absolute bathymetry exhibited high STH prediction, as did areas where there was an increased seabed slope. There was little or no influence of sea surface current speeds on STH values found in the IFF GLM. All factors in the CSR and IFF models were highly significant (p-value < 0.0001; Table 2), explaining approximately 17% (CSR) and 43% (IFF) of the variability in the two sub-regions.  Significance: *** p < 0.0001, * p < 0.01.
Drawing on existing oceanographic knowledge of the North Atlantic basin, preliminary interpretation of co-located patterns in the STH dataset (heterogeneity and homogeneity features) is possible.

North Atlantic Area (NAA)
Two well-known features in the Atlantic Ocean include the North Atlantic's (i) Subpolar Gyre, and (ii) Subtropical Gyre [33]. In the North Atlantic, the Subpolar Gyre is associated with a number of important overturning circulation currents, driven by changes in atmospheric forcing (winds and air-sea heat exchange). It is cyclonic in nature, and its associated boundary currents move around the North Atlantic in an anticlockwise direction. The spatio-temporal heterogeneity map (Figures 6 and 8) shows higher STH values where some of the boundary currents are located (e.g., the North Atlantic Current, East Greenland and Labrador Currents shown in Figure 10). The lower part of the STH map (Figures 6 and 8) shows the northern boundary of the Subtropical Gyre to the south. This demonstrates the potential in using heterogeneity features such as Earth Observation-derived fronts to reveal details of surface currents, which are not resolved by in-situ or altimetry measurements [57]. The STH dataset also shows a similar pattern to cold surface waters moving southwards in the Labrador Current. Boundaries depicted around the homogenous area F of Figure 6, and Class 3 in Figure 8, resemble the extent of the continental shelf ice-influenced pelagic habitat, identified in other ocean surface partitioning studies [4,58] Remote Sens. 2021, 13, x FOR PEER REVIEW (Figures 6 and 8) shows the northern boundary of the Subtropical Gyre to the so demonstrates the potential in using heterogeneity features such as Earth Observ rived fronts to reveal details of surface currents, which are not resolved by in-s timetry measurements [57]. The STH dataset also shows a similar pattern to col waters moving southwards in the Labrador Current. Boundaries depicted aroun mogenous area F of Figure 6, and Class 3 in Figure 8, resemble the extent of the co shelf ice-influenced pelagic habitat, identified in other ocean surface partitionin [4,58] Figure 11 shows an isolated ovoid STH feature in the central Newfoundlan This feature is located where a semi-permanent, mesoscale (~140 km radius) anti eddy known as a Mann Eddy is expected [63]. The Mann Eddy, first described [32], is located in the vicinity of the North Atlantic Current. It plays an importa the circulation, air-sea interaction, and climatology of the region [63], and is "n ways observed offshore of the North Atlantic Current near 42° N" [64]. Howeve examination of the temporal SST development of the feature, and surrounding w ses (Figure 11), indicates the STH feature is not distinguished solely by its eleva expression, as would be expected if SST alone could determine this phenome cooler water period also identifies it as a surface water class. This indicates th ocean surface processes influence the regional SST signals, and offers an oppor study and clarify the regions surface water dynamics.  Figure 11 shows an isolated ovoid STH feature in the central Newfoundland Basin. This feature is located where a semi-permanent, mesoscale (~140 km radius) anti-cyclonic eddy known as a Mann Eddy is expected [63]. The Mann Eddy, first described in 1967 [32], is located in the vicinity of the North Atlantic Current. It plays an important role in the circulation, air-sea interaction, and climatology of the region [63], and is "nearly always observed offshore of the North Atlantic Current near 42 • N" [64]. However, further examination of the temporal SST development of the feature, and surrounding water classes (Figure 11), indicates the STH feature is not distinguished solely by its elevated heat expression, as would be expected if SST alone could determine this phenomenon. The cooler water period also identifies it as a surface water class. This indicates that other ocean surface processes influence the regional SST signals, and offers an opportunity to study and clarify the regions surface water dynamics. Remote Sens. 2021, 13, x FOR PEER REVIEW 19 of 35 Figure 11. Six selected daily resolution SST and Surface Current (direction denoted by arrows, and speed by size and colour) images, taken throughout 2011, in the Newfoundland Basin Area (NBA). Images are overlaid with the 2011 STH dataset. The potential signature of the Mann Eddy (ME) is labelled ME throughout the sequence. Daily current direction and magnitude are derived from the GlobCurrent data product [46].

Celtic Seas Region (CSR)
The STH map extending over the Celtic Seas Region (CSR) exhibits some characteristics that are similar to a number of regionally important oceanographic features. Mediumintensity STH aligns with regionally important features, such as the thermohaline driven Irish shelf front situated off the coast of Ireland [65], and the Celtic Sea front [66] that separates the Celtic Sea, from the Irish Sea, particularly during the summer months [67,68]. The STH dataset clearly separates Irish coastal waters from offshore oceanic shelf waters, as they feature distinctive temporal profiles (Figure 12, classes 21 and 24). Beaugrand et al. identified the Celtic Sea region as a biogeographic crossroads, noting that in this area, oceanic, neritic and pseudo-pelagic species cohabit, with warm-and cold-water species regularly co-occuring [58]. They highlight the potential for complex coenoclines (a gradient of associated communities), and associated ecosystems, ecoclines and ecotones to exist in the area. Similar observations were noted by Barnard et al. in their investigation into plankton communities of the North Atlantic [69].
The STH map extending over the Celtic Seas Region (CSR) exhibits some characteristics that are similar to a number of regionally important oceanographic features. Medium-intensity STH aligns with regionally important features, such as the thermohaline driven Irish shelf front situated off the coast of Ireland [65], and the Celtic Sea front [66] that separates the Celtic Sea, from the Irish Sea, particularly during the summer months [67,68]. The STH dataset clearly separates Irish coastal waters from offshore oceanic shelf waters, as they feature distinctive temporal profiles (Figure 12, classes 21 and 24). Beaugrand et al. identified the Celtic Sea region as a biogeographic crossroads, noting that in this area, oceanic, neritic and pseudo-pelagic species cohabit, with warm-and coldwater species regularly co-occuring [58]. They highlight the potential for complex coenoclines (a gradient of associated communities), and associated ecosystems, ecoclines and ecotones to exist in the area. Similar observations were noted by Barnard et al. in their investigation into plankton communities of the North Atlantic [69]. Figure 13 shows the STH dataset overlying SSSp and SSTF counts, the two variables identified in the GLM as significant in this region. An inverse relationship between STH and surface current speed is clearly visible in the NW quadrant of Figure 13a. SSSp is regionally high over the Rockall trough, a deep underwater canyon. The STH values are low in adjacent waters, and high STH values are visible to the south (Porcupine Bank), east (Malin Shelf) and northwest (Rockall Bank). Here, the elevated STH values are associated with higher SST front counts (Figure 13b). It is notable that the strength of these many fronts was relatively weak, in comparison to those elsewhere in the North Atlantic region.

Figure 12.
Coastal and pelagic surface spatio-temporal heterogeneity around Ireland. The map shows the distribution of seven water classes in, and adjacent to, the Celtic and Irish Seas, overlaid by the STH dataset. The legend contains the temporal profiles described by their means and standard deviation, for those classes linked to the coastline, and those pelagic classes further out to sea. Colours are simply used to visualize a class, they do not denote any scale or gradient in temperature between classes.

Figure 12.
Coastal and pelagic surface spatio-temporal heterogeneity around Ireland. The map shows the distribution of seven water classes in, and adjacent to, the Celtic and Irish Seas, overlaid by the STH dataset. The legend contains the temporal profiles described by their means and standard deviation, for those classes linked to the coastline, and those pelagic classes further out to sea. Colours are simply used to visualize a class, they do not denote any scale or gradient in temperature between classes. Figure 13 shows the STH dataset overlying SSSp and SSTF counts, the two variables identified in the GLM as significant in this region. An inverse relationship between STH and surface current speed is clearly visible in the NW quadrant of Figure 13a. SSSp is regionally high over the Rockall trough, a deep underwater canyon. The STH values are low in adjacent waters, and high STH values are visible to the south (Porcupine Bank), east (Malin Shelf) and northwest (Rockall Bank). Here, the elevated STH values are associated with higher SST front counts (Figure 13b). It is notable that the strength of these many fronts was relatively weak, in comparison to those elsewhere in the North Atlantic region.

Iceland-Faroes Front Region (IFF)
The surface spatio-temporal heterogeneity (STH) map and key classes are used to describe the IFF region ( Figure 14). Here, the GLM identified bathymetry as an important characteristic, with the highest STH values observed in shallower waters (Figure 15a). This was most pronounced in locations with steeper gradients over a short seabed distance (Figure 15b). SST Front counts, while often weak, also exerted an important influence in the GLM (see Figure 15c). The poleward flow of surface Atlantic Ocean waters across the Greenland-Scotland ridge, through the Nordic Seas, and into the Arctic Ocean plays a fundamental part in ocean heat transport and thermohaline circulation in the region [70]. Approximately 50% of warm-saline surface inflows [70] of modified North Atlantic water into the Nordic Seas follows a path between Iceland and the Faroe Islands. In this dynamic region, the surface warm saline Atlantic water meets colder, less saline waters. This confluence of different water types is marked by the Icelandic-Faroe Front [71][72][73]. To the east, the Faroe Shelf Front [74,75] marks the region where the Atlantic waters meet tidally well-mixed waters on the Faroe shelf. Both these boundary regions were highlighted remarkably well by high STH values in this study ( Figure 14). Furthermore, bathymetric and shelf influences on these surface to bottom frontal systems were also captured in the GLM for the region. This echoes the findings of some hydrographic frontal studies, where a general dependence of frontal locations on bathymetry was noted [71]. The IFF area is a key example of why the study of surface spatio-temporal heterogeneity needs to consider the influence of sea bed topography on the surface behaviour. Interpretation of a STH dataset requires the researcher to understand the ocean as a four dimensional system, with processes and features at the ocean depths influencing the ocean surface signals to differing degrees over time.

Iceland-Faroes Front region (IFF)
The surface spatio-temporal heterogeneity (STH) map and key classes are used to describe the IFF region ( Figure 14). Here, the GLM identified bathymetry as an important characteristic, with the highest STH values observed in shallower waters (Figure 15a). This was most pronounced in locations with steeper gradients over a short seabed distance (Figure 15b). SST Front counts, while often weak, also exerted an important influence in the GLM (see Figure 15c). The poleward flow of surface Atlantic Ocean waters across the Greenland-Scotland ridge, through the Nordic Seas, and into the Arctic Ocean plays a fundamental part in ocean heat transport and thermohaline circulation in the region [70]. Approximately 50% of warm-saline surface inflows [70] of modified North Atlantic water into the Nordic Seas follows a path between Iceland and the Faroe Islands. In this dynamic region, the surface warm saline Atlantic water meets colder, less saline waters. This confluence of different water types is marked by the Icelandic-Faroe Front [71][72][73]. To the east, the Faroe Shelf Front [74,75] marks the region where the Atlantic waters meet tidally well-mixed waters on the Faroe shelf. Both these boundary regions were highlighted remarkably well by high STH values in this study ( Figure 14). Furthermore, bathymetric and shelf influences on these surface to bottom frontal systems were also captured in the GLM for the region. This echoes the findings of some hydrographic frontal studies, where a general dependence of frontal locations on bathymetry was noted [71]. The IFF area is a key example of why the study of surface spatio-temporal heterogeneity needs to consider the influence of sea bed topography on the surface behaviour. Interpretation of a STH dataset requires the researcher to understand the ocean as a four dimensional system, with processes and features at the ocean depths influencing the ocean surface signals to differing degrees over time.

Discussion
The OHMA method to map ocean surface spatio-temporal heterogeneity is a datadriven, unbiased approach, which exploits the hypertemporal resource of remote sensingderived datasets. It successfully adapts the LaHMa method [28] for use in ocean sciences. The addition of a Jeffries-Matusita analysis step enables the entire process to be automated, and executed without prior knowledge of the region under examination.

Discussion
The OHMA method to map ocean surface spatio-temporal heterogeneity is a datadriven, unbiased approach, which exploits the hypertemporal resource of remote sensingderived datasets. It successfully adapts the LaHMa method [28] for use in ocean sciences. The addition of a Jeffries-Matusita analysis step enables the entire process to be automated, and executed without prior knowledge of the region under examination.
The use of OHMA can enhance efforts to understand the regional structure of the ocean surface. It is known that the ocean surface can be organised into regions with distinct abiotic and biotic properties, that change over space and time [76]. One approach to understanding this surface diversity (heterogeneity), is Ocean Surface Partitioning (OSP) [4,58]. OSP aims to effectively group areas of the ocean surface together into delineated ocean 'provinces'. It is increasingly seen as an approach to help understand the patterns of ocean spatio-temporal variability [77], and the impacts of this variability [78]. Krug et al. give an extensive insight into the range of OSP methodologies and applications available [76]. More recently, there has been discussion on the benefits of enhancing static outputs (e.g., those of [4], and [58]) with more dynamic outputs (e.g., [77,79,80]). There is also discussion concerning the range of datasets which can be input to a partitioning strategy [58], and whether abiotic parameters, biotic parameters, or both, should be used to define provinces [4,58,77,81].
Whilst these discussions continue, there are opportunities to advance different perspectives on partitioning the ocean surface. Currently, OSP is inherently reliant on the grouping of ocean surface areas into coherent, well-defined and bounded areas of similarity. These areas can then be described using the input parameters, e.g., through defining ecosystem partitions [77], biological partitions [78], biogeographical partitions of varying complexities [4,58], or even phenological metrics [81]. These partitions rely on classification, effectively grouping homogeneous regions together and delineating their extents. Less obviously stated, is that by grouping areas together, we highlight their difference from other areas, and the nature of transitions between them. However, there is limited consideration of the influence of sharper spatio-temporal transitions in regional abiotic properties, and their impact on regions of increased photosynthetic activity. The STH information produced using the OHMA process could be particularly supportive as an input to OSP classification approaches. This is especially relevant given this research has shown that the OHMA-derived STH data product is a hybrid product, capturing several aspects of heterogeneity in a single measure.
Furthermore, there is an opportunity to explore whether the biological characteristics of a region are underpinned by a transition in abiotic conditions within it. In this study, the mapped, SST-based heterogeneity features aligned with known bio-geographical boundaries highlighted previously by Ocean Surface Partitioning (OSP) studies [4,58,79]. Areas which feature higher STH values can be considered as more permanent and/or stronger water mass boundary areas. Conversely, weaker STH values indicate regions where boundaries are weaker and/or more mobile and transitory over space and time.
This fluid, impermanent nature of ocean boundaries highlights a further opportunity. Multiple authors have noted the limitations of adopting purely static approaches to OSP [79,80]. Instead they adopted more dynamic methods to delineate provinces. These have consistently enhanced the delineation of biogeochemical province boundaries, while producing a number of maps to chart changes over seasons and years. The STH mapping proposed here can support such efforts. It offers insights into the characteristic permanence of statically mapped boundaries, on the basis of the input parameter. As such, pre-examination of a region using an OHMA approach could help objectively determine whether or not a more dynamic methodology is required.
The validation efforts highlighted a gap in in-situ data availability. 14 pixels were available with a sufficient temporal sampling quality to be comparable to the temporally rich satellite-derived SST dataset. While the associations between STH and in-situ skewness and kurtosis could be clarified, the sampling volume was not sufficient to properly examine the measures of dispersion. From the in-situ perspective, this information is only available for 14 pixel locations in Irish coastal waters, for a dataset spanning the North Atlantic. It involved processing an entire year of all available cruise data from two vessels, to extract temporally comparable measures to the satellite-derived product. As such, it is not currently feasible, using only in-situ data, to produce surface spatio-temporal heterogeneity maps at the spatial and temporal scales offered by OHMA. This not only emphasized the potential for STH datasets to complement existing efforts, it also demonstrated an observation gap at the in-situ level, which could be addressed. In addition to the Underway data collected by the Marine Institute [41,43], there are a number of examples of in-situ measurements being taken in transit and made available (for example data available through the GOSUD initiative [82], and the Shipborne-radiometer Network [83]). The challenging nature of the validation work highlighted the need to establish a set of dedicated spatio-temporal validation sites specifically measured using vessel transits. There is an opportunity for the ocean remote sensing community, and in-situ ocean observing community to identify key areas to sample, and direct vessels through. Establishing these sites would help to fully unlock the potential of hypertemporal data archives, and spatio-temporal data products.
The GLM element of this study was preliminary in nature, seeking to place the STH product in terms of available, recognised heterogeneity products. The multi-factor models, with highly significant, though low, R 2 values emphasized that (i) regional surface water heterogeneity was driven by a range of factors, and (ii) there remains unidentified drivers of surface water STH, far beyond what is typically used (fronts, currents and bathymetry). Future work would be well advised to introduce a number of other parameters to the regional models, such as Sea Surface Salinity [84], or altimetry-derived sea level products ( [85]) for example. In particular Sea Surface Salinity and sea level products would be well positioned to factor into GLM models for open ocean-focused studies, given their good performance in waters that are further offshore [84][85][86]. Other modelled parameters such as mixed-layer depth could also be considered if available.
In spite of its preliminary nature, the GLM study demonstrated the hybrid nature of the STH dataset. It captured different aspects of heterogeneity at the NAA scale, and in each different case study sub-region, yet delivers the information as a single measure. As such, the unique nature of the STH product was highlighted, as was its potential to serve as a single heterogeneity measure to support biodiversity and habitat mapping. Further work on multi-parameter studies investigating both the physical and biotic environment could be conducted using OHMA-derived STH datasets. This could clarify whether (i) the same logic applies across parameters, and (ii) whether STH datasets can enhance our understanding of biophysical systems in ocean surface waters.
Such systemic studies of the ocean surface need to be founded on understandings of regional oceanographic characteristics and features. This study demonstrated the utility of STH mapping to support such work. STH datasets, in combination with their ancillary classification datasets, reveal surface units which share high strength boundaries. These oceanic ecotones with pronounced gradient changes over small spatial scales, are evident in the NBA and IFF regions (shown as features G and H in Figure 6). These areas feature well studied frontal systems, co-located with high STH values. The high STH values highlight regions with greater mixing of water masses, potentially greater per-unit area diversity in habitat niches over time, and the presence of high-gradient transitions which can be exploited by species (for example Tuna species [18,87]), and thereby shape their distributions.
Meanwhile, contrasting this are surface units which do not share high, or even medium strength boundaries. These oceanic ecoclines (such as F in Figure 6), are regions across which the temperature profiles of the surface waters change, but gradually over spatial scales, without clear separable persistent boundaries between the surface water classes. These ecoclines are potentially useful. Their surrounding boundaries potentially delineate extensive regions, across which surface conditions and temporal fluctuations change gradually. The direction of change is indicated by both the mapped ecotones and ecoclines, coupled with the surface water class locations. Use of regular, persistent features and their spatial locations could be useful in targeting in-situ surface water studies specifically looking at change. It offers the potential to objectively chart sampling transects to capture the spatio temporal gradients, and enhance the cost-benefit ratio of ocean-going research cruises measuring them. These bounded ecoclines are also useful for the ocean remote sensing community, in particular those conducting time-series analyses on remotely sensed data. Time-series analysis can be used to explain and characterise the occurrence of observed temporal phenomena [88]. However, its use in ocean studies is restricted by the need for experience, and a-priori knowledge of regional seasonalities [26], used to guide model training [89]. Here, the suite of OHMA output datasets could be of use. The STH boundaries, classification maps and temporal legends can provide insights into the diversity of temporal signals the modeler must account for. The suite can also be used to define the larger more homogeneous regions for temporal sampling, particularly when seeking to define the endmembers needed for Temporal Mixture Analyses [90]. As such the objective nature of the OHMA process enhances the opportunities for these more restricted approaches to be deployed. This objective capability to quantify natural mobile discontinuities (ecoclines), and continuities (ecotones), also shows the approach could support the development of seascape [91] characterisations. Such applications have already been demonstrated in terrestrial landscape studies [11,28], noting that Kavanaugh et al. emphasize the need to characterise planktonic processes in support of seascape characterisations [91].
The methodology also captured information on a range of oceanographic features, which are difficult to measure at large spatial scales using in-situ measurements alone. Features such as the boundaries and meanders of the Gulf Stream at the surface (Figure 8), the Mann Eddy (Figure 11), the Celtic Sea Front (Figure 12), and the Icelandic and Faroes fronts (Figures 14 and 15), are a range of globally, and/or regionally important macroto mesoscale oceanic features, which are identifiably delineated by the STH dataset. It demonstrated the ability of the OHMA process to objectively quantify the spatial extents of these features, over a given timeframe, and gain a measure of their regional significance. A further step could see these features examined using more regionalised, and higher spatial and temporal resolution datasets. This could maximise their influence on the clustering process, and potentially better delineate them, and their spatio-temporal relevance. However, there would also be the risk that regionally, these features are not as distinctive as they are globally. Furthermore, as found by de Bie et al. when examining the Mekong Delta using the LaHMa method [28], higher spatial resolution imagery can misrepresent the diversity of the surface. In the Mekong study, this resulted in landscape-level gradient information being obscured beneath anthropogenic rural and urban boundaries. It remains to be seen whether deploying OHMA on finer spatial resolution data, will suffer from the same issues, given the lack of anthropogenic structures on the ocean surface.
Finally, it is unclear whether the OHMA process can adequately capture the temporal development of ocean surface features. Evidence from this study suggests that features such as near-surface currents in the Gulf Stream system, the Mann Eddy, and certain frontal systems (Figures 6, 10, 11 and 15) leave a signature in the measured SST image data, which can be highlighted and enhanced by the OHMA process. This could serve to highlight regions for further investigation, whereupon sea surface temperature front data could be generated using [49,50], or time-series of current characteristics could be generated using [92]. Alternatively shorter timeframes of more regionalised data could be input to generate OHMA-derived STH datasets. However, it remains unexplored whether the use of shorter temporal extents better delineate, or miss these features, and indeed, how the researcher should approach using OHMA to study features with a more rapid spatio-temporal character, such as fast moving mesoscale eddies. From the opposing perspective, the temporal stability of features could be examined. Some archives of SST now span climatological timeframes. This offers the potential to map and monitor more permanent ocean STH structures, quantify their temporal behaviour over climatological timeframes, and begin monitoring their stability as more heat energy circulates through Earth's systems.

Conclusions
The OHMA methodology is a robust tool suitable for the ocean science community to explore, examine, and describe the ocean's surface. It produces an objective higher-level heterogeneity product, which summarises information from hypertemporal image data, and successfully captures a range of ocean-surface heterogeneity perspectives (such as fronts, currents, and locally extreme values). In doing so, it advances our capability to map ocean surface spatio-temporal heterogeneity, with the potential to provide information on biophysical processes which affect the distribution of marine species.
Similar to the LaHMa method, the output STH dataset captures and presents oceanic ecotones and ecoclines, particularly when viewed alongside the ancillary classification dataset and legend which are produced as part of the process. In doing so, it enables us to objectively map regions of spatio-temporal change. This study described and demonstrated the validity, and potential utility, of this adapted STH mapping algorithm for ocean sciences, tailored to exploit hypertemporal Earth Observation imagery. Validation efforts indicate the algorithm detects locally increased transitions in the input ocean surface parameter (here demonstrated using SST). Characterisation using a range of ocean parameters showed the OHMA approach provides a high-level perspective of ocean surface heterogeneity. This enables the identification of high-change regions, and subsequent exploration using other measured/modelled parameters, or more targeted site investigations. Using hypertemporal SST imagery as an input, the algorithm proved effective at highlighting areas of increased SST front frequency, indicators of boundaries between water masses. Perspectives on surface currents and the underlying bathymetry were also related, to differing degrees, and in different combinations, to STH in the suite of case study areas. A number of future avenues for work were highlighted including (i) using regional coastal studies to examine the performance of the algorithm in more complex scenarios, (ii) examining the algorithm performance on a range of datasets at different temporal and spatial scales, and (iii) exploring ocean surface systems using STH products derived from a range of ocean surface parameters. The study also highlighted the need to establish dedicated spatio-temporal ocean validation sites, specifically measured using surface transits, to support advances in hypertemporal ocean data use, and exploitation.    7), and participants in the Irish Earth Observation Symposium 2018's water session, for advice and quality appraisal on the research approach and statistical analyses for OHMA validation and characterisation. They would like to thank Emma Chalençon for reviewing the language style used and replicability of the research. The authors would also like to thank the anonymous reviewers, whose efforts in reviewing this considerably long piece of work, and providing the concise and detailed feedback, were very much appreciated.

Conflicts of Interest:
This research was conducted as part of a program of PhD candidate research being undertaken by the lead author (Mr. R.G. Scarrott). The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Details of the Case Study Generalised Linear Models
The distributions of each case study's set of STH values is show in Figure A1. Only nonparametric GLMs were considered as potential option, given the non-normal distributions, and the STH values being count data. Figure A2 shows the correlation analysis to identify groups of variables that were inter-correlated. A single representative variable was selected to represent a group. Table A1 shows the details of each case study's GLM, including the input variables, the selected variables, the model variables, and the criteria values that determined model selection. In all case studies, a negative binomial model best fit the data. The distributions of each case study's set of STH values is show in Figure A1. Only non-parametric GLMs were considered as potential option, given the non-normal distributions, and the STH values being count data. Figure A2 shows the correlation analysis to identify groups of variables that were inter-correlated. A single representative variable was selected to represent a group. Table A1 shows the details of each case study's GLM, including the input variables, the selected variables, the model variables, and the criteria values that determined model selection. In all case studies, a negative binomial model best fit the data.   Figure A2. An example of the correlation analysis to identify and remove inter-correlated parameters. The NAA case study areas correlation matrix is shown. Red boxes are overlaid to highlight any correlation value over the threshold (0.45). Blue boxes highlight those parameters that were selected as inputs to the dredging stage of GLM development. Figure A2. An example of the correlation analysis to identify and remove inter-correlated parameters. The NAA case study areas correlation matrix is shown. Red boxes are overlaid to highlight any correlation value over the threshold (0.45). Blue boxes highlight those parameters that were selected as inputs to the dredging stage of GLM development.  Figure A3 shows the hypothetical effect of extreme local values or gradients on the distribution shape of transit measurements, along a vessel path. These shape changes were measured as skewness and kurtosis.  Figure A3 shows the hypothetical effect of extreme local values or gradients on the distribution shape of transit measurements, along a vessel path. These shape changes were measured as skewness and kurtosis.

Appendix B. Relationship between Locally Extreme Values, and Distribution Shape
(a) (b) Figure A3. Hypothetical effect of locally extreme measurement values on the shape of the probability distribution for a series of transit measurements. (a) The hypothetical impact of extreme values on skewness is shown using two theoretical temperature transits. Red represents a more homogeneous transit in terms of values, with blue representing one with more extreme spatial transitions in temperature. Overlying probability distributions show how the presence of a locally extreme value skews the distribution of the vessel measurement. (b) The hypothetical impact of more (i) locally abrupt changes in values or (ii) locally high spatial gradients in measurements. Low change transits (and probability distribution) are depicted in red, with more heterogeneous transits (and probability distributions) in blue. The panel shows how the presence of higher spatial gradients increases the tail weight of the distribution (leptokurtic), whilst the inclusion of more extreme values lightens the distribution's tails (platykurtic). Figure A3. Hypothetical effect of locally extreme measurement values on the shape of the probability distribution for a series of transit measurements. (a) The hypothetical impact of extreme values on skewness is shown using two theoretical temperature transits. Red represents a more homogeneous transit in terms of values, with blue representing one with more extreme spatial transitions in temperature. Overlying probability distributions show how the presence of a locally extreme value skews the distribution of the vessel measurement. (b) The hypothetical impact of more (i) locally abrupt changes in values or (ii) locally high spatial gradients in measurements. Low change transits (and probability distribution) are depicted in red, with more heterogeneous transits (and probability distributions) in blue. The panel shows how the presence of higher spatial gradients increases the tail weight of the distribution (leptokurtic), whilst the inclusion of more extreme values lightens the distribution's tails (platykurtic).