Subtidal Natural Hard Substrate Quantitative Habitat Mapping: Interlinking Underwater Acoustics and Optical Imagery with Machine Learning

: Subtidal natural hard substrates (SNHS) promote occupancy by rich benthic communities that provide irreplaceable and fundamental ecosystem functions, representing a global priority target for nature conservation and recognised in most European environmental legislation. However, scien-tiﬁcally validated methodologies for their quantitative spatial demarcation, including information on species occupancy and ﬁne-scale environmental drivers (e.g., the effect of stone size on colonisation) are rare. This is, however, crucial information for sound ecological management. In this investigation, high-resolution (1 m) multibeam echosounder (MBES) depth and backscatter data and derivates, underwater imagery (UI) by video drop-frame, and grab sediment samples, all acquired within 32 km 2 of seaﬂoor in offshore Belgian waters, were integrated to produce a random forest (RF) spatial model, predicting the continuous distribution of the seaﬂoor areal cover/m 2 of the stones’ grain sizes promoting colonisation by sessile epilithic organisms. A semi-automated UI acquisition, processing, and analytical workﬂow was set up to quantitatively study the colonisation proportion of different grain sizes, identifying the colonisation potential to begin at stones with grain sizes Ø ≥ 2 cm. This parameter (i.e., % areal cover of stones Ø ≥ 2 cm/m 2 ) was selected as the response variable for spatial predictive modelling. The model output is presented along with a protocol of error and uncertainty estimation. RF is conﬁrmed as an accurate, versatile, and transferable mapping methodology, applicable to area-wide mapping of SNHS. UI is conﬁrmed as an essential aid to acoustic seaﬂoor classiﬁcation, providing spatially representative numerical observations needed to carry out quantitative seaﬂoor modelling of ecologically relevant parameters. This contribution sheds innovative insights into the ecologically relevant delineation of subtidal natural reef habitat, exploiting state-of-the-art underwater remote sensing


Introduction
Subtidal natural hard substrates (SNHS), whether of geogenic or biogenic origin, represent an important ecological habitat and cover essential functions in marine ecosystems [1], making them a worldwide priority target for environmental management, conservation, and restoration. Structural complexity at various scales enhances bentho-pelagic coupling, and subsequently, the associated ecological succession phases by acting as settling, shelter, feeding, and nursery grounds for a variety of marine organisms at multiple trophic levels [1,2]. In an anthropogenically threatened and rapidly changing marine environment [3,4], the heterogeneous and scattered distribution of SNHS represents keystone investigation is the exploitation of the UI in favour of a quantitative (i.e., continuous; numerical) as well as ecologically informed formulation of the criterion used to spatially delineate the targeted habitat.

Study Site
In Belgian waters, the existence of SNHS is relatively well documented, both historically ( [38] and references therein) and recently. Currently, maps of potential SNHS distribution, based on a combination of geological, geophysical, sedimentological, and visual observations, have been produced [39][40][41]. As exemplified by the "gravel bed potential distribution" map ( Figure 1),~16% of the Belgian seafloor relates to variable [42] SNHS, with a potential of stone occurrences. However, the spatial resolution of such maps remains too coarse for adequate ecological management of SNHS communities, requiring the accurate pinpointing of spatial units down to the level of meso-to fine-scale patches [43,44] and/or single stones [24]. With the advent of high-resolution seafloor mapping technologies and advanced seafloor modelling approaches [16], high-resolution research can now also be addressed towards the regional scale benefitting environmental monitoring obligations. In this scenario, the Belgian continental shelf (BCS) serves as an excellent case study to select a study area overlapping with the potential distribution map and test advancing ASC-based BHM methodologies. The study site is located beyond the 24 nautical miles, in offshore waters, at the northwesternmost delimitation of the BCS ( Figure 1) and covers~32 km 2 of MBES and ground truth survey area (Figures 1 and 2). Here, water depth range is 37-43 mLAT and is influenced by a semi-diurnal tidal regime, with mean ranges of~4.3 mLAT and~2.8 mLAT during spring and neap tides, respectively (referenced in Zeebrugge [45]). The tidal ellipse is mainly oriented in SW-NE direction, and the residual current transport is mainly oriented towards the NE [46]. The water mass is well mixed year-round with salinity values around 34 practical salinity units [47], and with a relatively clear water mass characterised by sediment transport estimated at <0.5 tonnes/m per day [48]. During spring tide, peak surface and bottom current velocities can exceed 1 m/s and 0.7 m/s, respectively [48]. In this region, the maximum significant offshore wave height can reach >4 m [49].
In this study, the term "stones" (also referred to as "lag deposit" and "subtidal natural hard substrate" or SNHS throughout the manuscript) is in alignment with the research conducted by Michaelis and co-workers [24,25], based on the international standard ISO14688-1:2017 [50] and it is defined as solid geogenic material with a diameter larger than 2 cm. This grain size classification system was preferred over, e.g., the Wentworth [51] scale as it distinguishes between boulders and large boulders. The specific terms refer to: "coarse gravel" (Ø 2-6 cm), hereafter referred to as "gravel", "cobble" (Ø 6-20 cm), "boulder" (Ø 20-63 cm), and "large boulder" (Ø > 63 cm). The grain size class "fine gravel" (Ø < 2 cm) was added to this standard, enhancing the grain size discrimination potential. This standard was used to derive a quantitative estimation of stone sizes from the underwater imagery.     imagery; composite image of secondary features derived from the MBES primary data (see after, section "Spatial modelling by RF"); survey extent with ground truth locations and areas of interest selected for modelling. Areas of interest selected for modelling (Roman characters) and corresponding video transects (e.g., T1) are labelled accordingly in Figure 5 (see "Results" section).

Materials and Methods
Multibeam hydroacoustic surveys and ground truth data: multibeam hydroacoustic and ground truth surveys were conducted with RV A9602 Belgica during 2018 and 2019 (Table 1). Bathymetry and backscatter data were logged using a hull-mounted Kongsberg (https://www.kongsberg.com/maritime/) EM3002 Dual System operating at a nominal frequency of 300 kHz ( Table 1). The echosounder emits 508 soundings per ping cycle. Transmit and receive units produce small (Tx and Rx) beams with widths of 1.5 • × 1.5 • . The Rx beam spacing was set in high-density equidistant geometry, and sound emitted at pulse length of 150 µs in MBES normal mode. The quality of a part of the bathymetry acquired during campaign 2018-17 was suboptimal due to a poor real-time kinetic (RTK) positioning signal and post hoc tidal-level corrections were needed. Low quality in bathymetry implies poor geometric corrections of backscatter data; [52,53] for details. Furthermore, artefacts represent a severe source of error in image analysis and predictive modelling [54,55]. Due to this, bathymetry data acquired and processed by the Agency for Maritime Services and Coast, Coastal Division, Flemish Hydrography, and acquired by a Kongsberg EM3002 dual system installed on HV Ter Streep during 2013 (Table 1), were used instead to substitute a small portion of the study area. All other surveys benefitted from RTK tidal corrections. Specific details of the MBES backscatter data processing software and pipelines can be found in [56] and [57], chapter 3. Standardised procedures were followed for backscatter data acquisition and processing, including implementing an operational routine to ensure the echosounder measurements' stability (or drift, see [19,57,58]), enabling merging backscatter data acquired at different moments in time, hence enabling efficient image analysis, e.g., [59][60][61][62]. The backscatter processing nomenclature proposed by [20] was used, and the backscatter product used in this investigation refers to fully compensated backscatter imagery (CBI), with code: "A4 B0 C0 D0 E5 F0". For further analysis, the MBES data were gridded into raster format (floating point 32-bit) files with a 1 m horizontal resolution. Table 1. Timeframe ID of the oceanographic campaigns, geophysical data type, ground truth data, and number of selected frames used for further analysis. RV = research vessel; MBES = multibeam echosounder (seafloor depth and backscatter); GT = ground truth; HG = Hamon grab; UV = underwater video; n.a. = not applicable; Rep. = replicate sample. Link to oceanographic campaign reports and course of actions available at RV Belgica web site: https://odnature.naturalsciences. be/belgica/en/(re: campaign ID). Underwater imagery acquisition, processing, and analysis. Underwater imagery (UI) was acquired by means of a video drop-frame system; a robust steel metal, winch-operated frame (http://www.vliz.be/en/videoframe-en) equipped with a medium-resolution ROS inspector colour camera (https://www.rosys.com/), linked to a MacArtney (www.macartney.com) multimedia controller unit (pan, tilt, zoom), and set up to acquire a 1 m 2 optical field of view. In this investigation, the latter camera system provided only a video live feed solution during the data acquisition phase. The UI used in this study was acquired by means of a GoPro 5 video camera (GoPro, Inc., San Mateo, CA, USA-https://gopro.com/) installed on the metal frame, downward looking (i.e., orthogonally to the seafloor ideal plane), at a height of~75 cm from the bottom, achieving on average a field of view of 0.47 m 2 (about half the footprint size of the gridded MBES data). The GoPro camera was operated in FOV-linear mode, acquiring 60 frames per second (FPS), with a resolution of 1920 × 1440 pixels. GoPro's focal length measures 16.5 mm. Using these settings, the underwater imagery acquired by the GoPro camera is able to resolve sub-centimetre objects. Two HUGYFOT (https://www.hugyfot.com/) Arius 1500 lumen underwater lights were installed on the frame and mounted orthogonally to the seafloor, reducing backscattering of light from the seston, otherwise reducing image quality. Two green laser pointers were also installed on the frame to provide a metric reference scale (10 cm apart). Positions were corrected for antenna layback adjusted to the cable breakpoint on the winch's frame. Positional uncertainty of the drop-camera system was minimised by acquiring data exclusively during slack-water windows when the current velocity reaches <0.5 knot and by ensuring minimal slack of the cable. During the acquisition of videos, the frame was kept as close as possible to the bottom (i.e., ≤1 m, achievable by controlling the altitude from the on-board monitor control unit). To improve the quality of the videos (and subsequently extracted frames), where possible, the frame was often lowered onto the ground and left recording for several seconds. A total of five video transects were acquired during the two oceanographic campaigns. Positioning was achieved by coupling the image timestamp to the vessel position provided by the On-Board Data Acquisition System (ODAS-https://odnature.naturalsciences.be/ belgica/en/odas), with a chosen acquisition rate of 1 s to have a more precise logging of the heading and geographical coordinates. For each video transect, frames were extracted every 10 s using VLC media player, resulting in a UI sample (still frame) of the seafloor every~2.5 m of drift/navigation. A total of 426 still frames were extracted from the five videos. From this initial set, blurred scenes (i.e., due to sediment plumes upon landing and/or a too high concentration of seston) and/or scenes that did not relate to the seafloor (i.e., down-and up-casting phases) were removed from the dataset, reducing it to n = 265 seafloor observations measuring on average 0.64 m 2 (median = 0.57 m 2 ) and exploitable for the extraction of quantitative parameters, as summarised and illustrated by Table 2 and Figure 3. Still frames were further processed using software ImageJ. Image contrast and illumination were manually enhanced. The pixel-to-cm scale was set against the laser pointers (10 cm apart), calibrating the images in terms of metric scale, and enabling further data analysis.
Based on the calibrated still frames, an image-annotation protocol was set up to extract numerical parameters in order to: (A) study geo-bio relations, i.e., identify the grain size fraction of interest with respect to colonisation by epilithic organisms; and (B) test RF for regression to model the distribution of the percentage cover of the identified grain size fraction of interest over broader spatial scales. To address point A, calibrated images displaying stones (n = 149) were treated in ImageJ, whereas the remaining images, with no visible stones, were assigned a value of 0%. A semi-automated routine fitting elliptic and circular patterns, referred to as regions of interest (ROIs), was set up to circumscribe the visible stones in the still images. By doing so, the Feret (or calliper) diameter (Ø) of each stone could be extracted, thus related to a grain size class (as specified in the "Study site" section). Furthermore, stones' ROIs were labelled as colonised and/or uncolonised. This selection criterion was based on the presence of at least one visible colony or individual specimen of epilithic organisms colonising the visible stone surface (e.g., Alcyonium digitatum, Alcyonidium diaphanum, Spirobranchus triqueter, Metridium senile, Flustra foliacea, Nemertesia sp.). This information was used to compute estimates of proportion of stone colonisation per grain size class and by video transect location. The ecological information obtained from point A is in turn tabulated along the X and Y coordinates of the ground truth dataset and used to address point B, with predictive spatial modelling by RF for regression.  Table 2. Investigated parameters form the dataset of calibrated underwater still frames obtained from underwater videos. Interpretation; ◊ = colonised/uncolonised. * = not explicitly used in this investigation. Count and areal measurements were standardised to an identical sample size of 1 m 2 .

Investigated Key Parameters of the Selected Dataset of Still Underwater Images Input
Parameter Scale Free text Still frame area (height × width) Numerical (m 2 ) Selection Large boulders (>63 cm) ◊ Numerical (cm Feret Ø; (1/0); area m 2 ) Boulders (20-63 cm) ◊ Cobbles (6-20 cm) ◊ Coarse gravel (2-6 cm) ◊ Fine gravel (<2 cm) ◊ Selection Community matrix * Numerical/counts; species conspicuous features Selection Substrate distribution pattern * Categorical, predefined (i.e., dense/sparse/bare) Spatial modelling strategy. In this investigation, UI is exploited to define a quantitative and ecologically relevant target response for predictive modelling, thereby ecologically meaningfully delineating SNHS habitat in the predicted maps. The modelled target response relates to the surface area of geogenic features (i.e., stones) where biological colonisation by epilithic species occurs. To that end, the standardised prospective-mapping workflow, based on UI and MBES spatial data, is set up to: (1) numerically study the biological colonisation proportion of different stone sizes (in respect to visible epilithic species); hence, to identify the grain sizes diameter (Ø) where the colonisation potential begins. Consequently, the workflow is set up to: (2) exploit the numerical information achieved by UI (i.e., the % seafloor cover of these grain sizes/m 2 ) for spatial predictive modelling using a state-of-the-art machine learning approach.
Spatial modelling by RF. RF (Breiman, 2001a and details therein) is a machine learning algorithm based on the fundamental unit of machine learning: the decision tree. The algorithm was chosen due to its widespread success in computer [63] and marine sciences applications, specifically relating to advanced ASC, e.g., [16,27,32,64,65]. Furthermore, the algorithm can be used to model continuous (i.e., numerical data), satisfying the map-making requirements of the habitat mapping methodology being tested in this investigation. RF modelling was implemented in R using part of the ModelMap package module [66] and user custom-made code snippets. The habitat-mapping methodology was tested on a spatial sub-selection of the surveyed study area (32 km 2 ), hence four 1 × 1 km squared areas, surrounding the underwater imagery and Hamon grab sample locations ( Figure 2).  Table 2. Investigated parameters form the dataset of calibrated underwater still frames obtained from underwater videos. Interpretation; ♦ = colonised/uncolonised. * = not explicitly used in this investigation. Count and areal measurements were standardised to an identical sample size of 1 m 2 .

Input
Parameter Scale Free text Still frame area (height × width) Numerical (m 2 ) Selection Large boulders (>63 cm) ♦ Numerical (cm Feret Ø; (1/0); area m 2 ) Boulders (20-63 cm) ♦ Cobbles (6-20 cm) ♦ Coarse gravel (2-6 cm) ♦ Fine gravel (<2 cm) ♦ Selection Community matrix * Numerical/counts; species conspicuous features Selection Substrate distribution pattern * Categorical, predefined (i.e., dense/sparse/bare) Spatial modelling strategy. In this investigation, UI is exploited to define a quantitative and ecologically relevant target response for predictive modelling, thereby ecologically meaningfully delineating SNHS habitat in the predicted maps. The modelled target response relates to the surface area of geogenic features (i.e., stones) where biological colonisation by epilithic species occurs. To that end, the standardised prospective-mapping workflow, based on UI and MBES spatial data, is set up to: (1) numerically study the biological colonisation proportion of different stone sizes (in respect to visible epilithic species); hence, to identify the grain sizes diameter (Ø) where the colonisation potential begins. Consequently, the workflow is set up to: (2) exploit the numerical information achieved by UI (i.e., the % seafloor cover of these grain sizes/m 2 ) for spatial predictive modelling using a state-of-the-art machine learning approach.
Spatial modelling by RF. RF (Breiman, 2001a and details therein) is a machine learning algorithm based on the fundamental unit of machine learning: the decision tree. The algorithm was chosen due to its widespread success in computer [63] and marine sciences applications, specifically relating to advanced ASC, e.g., [16,27,32,64,65]. Furthermore, the algorithm can be used to model continuous (i.e., numerical data), satisfying the mapmaking requirements of the habitat mapping methodology being tested in this investigation. RF modelling was implemented in R using part of the ModelMap package module [66] and user custom-made code snippets. The habitat-mapping methodology was tested on a spatial sub-selection of the surveyed study area (32 km 2 ), hence four 1 × 1 km squared areas, surrounding the underwater imagery and Hamon grab sample locations ( Figure 2). Within these locations, the modelling pipeline is composed of three steps: (1) feature  (3) accuracy and uncertainty assessment. The aim of the statistical modelling is to produce a map displaying the distribution of the grain size fraction of interest with respect to colonisation by epilithic fauna. As such, the overall percentage cover of the identified grain size fractions of interest (i.e., % relativised to surface area) was selected as the response variable to be modelled over the MBES spatial data.
Feature derivation and selection. A set of geomorphometric, statistical, and textural secondary features was computed from the MBES primary data. From the CBI, statistical (mean, max. min. mode, median, and standard deviation) and textural (Sobel-Feldman filters for intensity, direction, and edge R package SpatialEco, [67], grey level co-occurrence matrix homogeneity index R package GLCM [68], and Moran's I autocorrelation R package raster [69]) were derived, whereas from the bathymetry, rugosity and slope were computed. Overall, n = 15 secondary features were derived, using a 3 × 3-pixel neighbourhood window. The importance of multiscale analysis in predictive models is recognised [70], though only the smallest possible scale of derivation (i.e., window of 3 × 3) was used to ensure maximal exploitation of the high-resolution data. Secondary features were chosen based on their expected influence on the target response and on their reported importance in previous research [16]. A feature selection routine was applied to reduce the initial dataset to a set of relevant features only. The RF Boruta [71] wrapper function was used, including all features, even those that were highly correlated with a Spearman rank coefficient of 0.96 (as from correlations analysis, not shown here, and as from previous RF research [63]).
Model tuning: The RF algorithm has few parameters that require tuning and has been reported to be generally insensitive to alterations of the default values [34]. Recent applications of RF in marine sciences however indicated the importance of tuning two hyperparameters: (1) the number of decision trees grown in the forest (i.e., ntree = 500); and (2) the number of predictors used at the splits during the tree-building process (mtry). The ModelMap package [66] automatically accounts for this potential bias and select the optima with respect to the out-of-bag (OOB) error estimate (in mean squared residual-MSR). In terms of training data, 30% was withheld for validation, and training sample selection (n = 186) was randomly stratified per bins of 10%, balancing test and training sample sets.
Model accuracy was evaluated against a set of withheld samples (30%, n = 79). The following accuracy metrics were computed between observed and predicted values: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), Pearson correlation coefficient (p), and Spearman correlation coefficient ( ). Both MAE and RMSE are measured in the same units as the response variable. R 2 and descriptive statistics (i.e., quantiles) are additionally used to assess model performance. Moreover, model uncertainty was accounted for by computing the coefficient of variation (CV). Being composed of several decision trees, the model's CV can be obtained by dividing the standard deviation of the predictions for each pixel by their respective predicted mean value. This spatial information was used to mask the RF model output, displaying the predictions for the pixels with an equivalent CV pixel equal to 0 (i.e., the lowest uncertainty). Further corroborating the evaluation of model accuracy, a set of Hamon grab samples (see Table 1) were used to qualitatively validate model predictions. This source of ground truth data could only be used qualitatively, validating the model predictions in a binary sense (i.e., presence/absence of stones), given the incomparability of estimates of percentages of stone cover between underwater imagery and grab samples, as well as the lack of coincident video and Hamon grab sample locations that would allow quantitative comparison. Finally, RF importance metrics (from Boruta) and partial dependence plots were used to study how selected important features influenced the prediction of the target response.

Results
Bio-Geo relations from underwater imagery. Overall, the number of delineated stones (NDS) in the UI dataset accounts for n = 9639 stones. Except for the fine gravel (NDS = 2110) that was consistently uncolonised by epilithic fauna across the five video transects, the proportions of colonised coarse gravel (NDS = 5986), cobbles (NDS = 1492), boulders (NDS = 49), and large boulders (NDS = 2) significantly differed (χ 2 statistics; degrees of freedom (D.F.) = 4; fine gravel χ 2 = 2.138; p = 0.71; coarse gravel χ 2 =22,565; p =< 0.001; cobble χ 2 = 39.716; p =< 0.001; boulder χ 2 = 200; p =< 0.001; very large boulder χ 2 = 400; p =< 0.001). Across the five transects, the proportion of colonisation ranges varies considerably across grain size classes. Besides fine gravel (i.e., 0%), colonisation of coarse gravel is considerable, ranging between 5.4 and 27.7%. Cobbles are colonised in the range 34.5-98.8%, whereas boulders and large boulders, if present, are consistently colonised (0-100%). The relationship between the average (across five video transects) proportion of colonisation with increasing stone size is reported in Figure 4. Following these observations, it was decided that for this dataset and moment in time, the grain size of interest with respect to colonisation by epilithic fauna starts at stone sizes Ø ≥ 2 cm. Thus, the areal cover cumulatively accounting for coarse gravel, cobble, boulder, and large boulder grain size classes (expressed as %/m 2 and in the range 0-46%) was selected as the parameter (target response) from underwater imagery used for spatial modelling.

Results
Bio-Geo relations from underwater imagery. Overall, the number of delineated stones (NDS) in the UI dataset accounts for n = 9639 stones. Except for the fine gravel (NDS = 2110) that was consistently uncolonised by epilithic fauna across the five video transects, the proportions of colonised coarse gravel ( Across the five transects, the proportion of colonisation ranges varies considerably across grain size classes. Besides fine gravel (i.e., 0%), colonisation of coarse gravel is considerable, ranging between 5.4 and 27.7%. Cobbles are colonised in the range 34.5-98.8%, whereas boulders and large boulders, if present, are consistently colonised (0-100%). The relationship between the average (across five video transects) proportion of colonisation with increasing stone size is reported in Figure 4. Following these observations, it was decided that for this dataset and moment in time, the grain size of interest with respect to colonisation by epilithic fauna starts at stone sizes Ø ≥ 2 cm. Thus, the areal cover cumulatively accounting for coarse gravel, cobble, boulder, and large boulder grain size classes (expressed as %/m 2 and in the range 0-46%) was selected as the parameter (target response) from underwater imagery used for spatial modelling. RF spatial predictions, accuracy, and uncertainty. Predictive performance of the selected RF model (i.e., 70% training, 500 trees, and six features randomly selected at each node) was tested, per modelled subarea, by comparisons of observed and predicted values from the whittled test set (i.e., 30%). The modelling approach indicated a good performance, quantified by multiple accuracy metrics (Table 3). Both MAE and RMSE have very low values, the adjusted R 2 has a moderately high value, and both linear and nonlinear correlation coefficients coincide with a high value, indicating the relative absence of outliers (as observed in previous studies; Gazis et al., 2018). Accordingly, the difference between observed and predicted mean and median (Q 3 ) values is very small (0.3 and 1%, RF spatial predictions, accuracy, and uncertainty. Predictive performance of the selected RF model (i.e., 70% training, 500 trees, and six features randomly selected at each node) was tested, per modelled subarea, by comparisons of observed and predicted values from the whittled test set (i.e., 30%). The modelling approach indicated a good performance, quantified by multiple accuracy metrics (Table 3). Both MAE and RMSE have very low values, the adjusted R 2 has a moderately high value, and both linear and nonlinear correlation coefficients coincide with a high value, indicating the relative absence of outliers (as observed in previous studies; Gazis et al., 2018). Accordingly, the difference between observed and predicted mean and median (Q 3 ) values is very small (0.3 and 1%, respectively). The interquartile range at intervals Q 2 and Q 5 reveals the inherent limitation of RF regression unable to predict out-of-training-range values [34], as well as over-and underestimating minimum and maximum values, respectively (as observed in similar investigations [27,72]). RF importance and spatial drivers. The analysis of the RF importance, by means of Boruta feature selection and partial dependence plots of selected features, revealed their ranked importance (not shown here), in respect to the prediction of the target response (i.e., those that explained most variance), as well as revealing their underlying nonlinear relations ( Figure 5). Of the overall fifteen features, two were discarded (3 × 3 CBI Sobel filters for direction and edge). The remainder (n = 13) features were kept as important, scoring >5 Z-scores (i.e., all CBI statistics (min. max. median, standard deviation, and mode), bathymetry and all secondary morphometric features (rugosity and slope), CBI Moran's I autocorrelation, and CBI Sobel intensity index). Expressed in terms of Z-scores by the Boruta analysis, the feature that most significantly contributed to the target prediction was the backscatter minimum value (high backscatter strength), followed by the bathymetry. All geomorphometric features (i.e., slope and rugosity), were retained as important. The partial dependence plots reveal that all the retained features contribute in a non-linear manner to the prediction of the response variable, as well as showing that, at this scale (i.e., 1 m horizontal resolution and 3 × 3 neighbourhood windows), there exist specific data intervals (e.g., mid-range depths, higher backscatter intensities, terrain with a slope <4 • , rugosity values < 0.35) promoting higher coverage of stones with a high potential of colonisation by epilithic fauna.
Remote Sens. 2021, 13, x FOR PEER REVIEW 10 of 19 respectively). The interquartile range at intervals Q 2 and Q 5 reveals the inherent limitation of RF regression unable to predict out-of-training-range values [34], as well as over-and underestimating minimum and maximum values, respectively (as observed in similar investigations [27,72]). RF importance and spatial drivers. The analysis of the RF importance, by means of Boruta feature selection and partial dependence plots of selected features, revealed their ranked importance (not shown here), in respect to the prediction of the target response (i.e., those that explained most variance), as well as revealing their underlying nonlinear relations ( Figure 5). Of the overall fifteen features, two were discarded (3 × 3 CBI Sobel filters for direction and edge). The remainder (n = 13) features were kept as important, scoring >5 Z-scores (i.e., all CBI statistics (min. max. median, standard deviation, and mode), bathymetry and all secondary morphometric features (rugosity and slope), CBI Moran's I autocorrelation, and CBI Sobel intensity index). Expressed in terms of Z-scores by the Boruta analysis, the feature that most significantly contributed to the target prediction was the backscatter minimum value (high backscatter strength), followed by the bathymetry. All geomorphometric features (i.e., slope and rugosity), were retained as important. The partial dependence plots reveal that all the retained features contribute in a non-linear manner to the prediction of the response variable, as well as showing that, at this scale (i.e., 1 m horizontal resolution and 3 × 3 neighbourhood windows), there exist specific data intervals (e.g., mid-range depths, higher backscatter intensities, terrain with a slope < 4°, rugosity values < 0.35) promoting higher coverage of stones with a high potential of colonisation by epilithic fauna.  Predicted spatial distribution of subtidal natural hard substrates associated with epilithic organisms. The first row of Figure 6 displays the spatial predictions (range 0-34%) of the natural hard substrates identified as having a high potential of colonisation by epilithic fauna. These modelled predictions capture the morpho-sedimentological continuum of the seafloor/habitat and show that the distribution of hard substrates (expressed as stone coverage %/m 2 ) coincides with two predominant spatial configurations (i.e., seascape patterns): (1) relatively large-scale striations (here aligned with the main tidal axis SW-NE and within depressions where the current channelises); and (2) small-scale elliptical shapes, observed under the form of relatively small-scale troughs/depressions. Striations develop longitudinally in the orders of kilometres, forming a stripy pattern with narrow (200-300 m) bands, alternated by sand ribbons. Small-scale elliptical shapes (patches) range in Ø 10-25 m and alternate with small to medium-sized sand ripple morphologies ranging in height 1-2 m and wavelength (λ) 10-20 m (as apparent from the hill shade semi-transparent map highlighting the geomorphology). Both spatial configurations are populated by relatively high percentages of stone cover, with values in the range 10-34% cover.
Complementing the model predictions, the coefficient of variation (mid-row Figure 6) displays spatial uncertainty. This information provides a spatially explicit understanding of the modelled areas' uncertainty, displaying the by-pixel algorithm's consistency in allocating predicted values. This information was used to spatially constrain the model predictions (bottom row, Figure 6) where the CV was lowest, i.e., with a value of 0. The resulting map captures the spatially explicit patchy distribution of the natural hard substrate and removes the minimum overestimated values coincident with the hill shade sand signatures. Finally, the Hamon grab samples (blue outlined triangles, bottom row pictures, Figure 6), qualitatively corroborate the model predictions, in a binary sense, displaying samples with a high content of hard substrate, correctly allocated to pixels of striations and small-scale elliptical patches (blue tail of the colour palette), whereas predominantly sandy samples are correctly allocated outside these zones, thus on areas predicted in the range 0-8% (red tail of the colour palette, first row, Figure 6; dark black areas, third row, Figure 6).
Predicted spatial distribution of subtidal natural hard substrates associated with epilithic organisms. The first row of Figure 6 displays the spatial predictions (range 0-34%) of the natural hard substrates identified as having a high potential of colonisation by epilithic fauna. These modelled predictions capture the morpho-sedimentological continuum of the seafloor/habitat and show that the distribution of hard substrates (expressed as stone coverage %/m 2 ) coincides with two predominant spatial configurations (i.e., seascape patterns): (1) relatively large-scale striations (here aligned with the main tidal axis SW-NE and within depressions where the current channelises); and (2) small-scale elliptical shapes, observed under the form of relatively small-scale troughs/depressions. Striations develop longitudinally in the orders of kilometres, forming a stripy pattern with narrow (200-300 m) bands, alternated by sand ribbons. Small-scale elliptical shapes (patches) range in Ø 10-25 m and alternate with small to medium-sized sand ripple morphologies ranging in height 1-2 m and wavelength (λ) 10-20 m (as apparent from the hill shade semi-transparent map highlighting the geomorphology). Both spatial configurations are populated by relatively high percentages of stone cover, with values in the range 10-34% cover.
Complementing the model predictions, the coefficient of variation (mid-row Figure  6) displays spatial uncertainty. This information provides a spatially explicit understanding of the modelled areas' uncertainty, displaying the by-pixel algorithm's consistency in allocating predicted values. This information was used to spatially constrain the model predictions (bottom row, Figure 6) where the CV was lowest, i.e., with a value of 0. The resulting map captures the spatially explicit patchy distribution of the natural hard substrate and removes the minimum overestimated values coincident with the hill shade sand signatures. Finally, the Hamon grab samples (blue outlined triangles, bottom row pictures, Figure 6), qualitatively corroborate the model predictions, in a binary sense, displaying samples with a high content of hard substrate, correctly allocated to pixels of striations and small-scale elliptical patches (blue tail of the colour palette), whereas predominantly sandy samples are correctly allocated outside these zones, thus on areas predicted in the range 0-8% (red tail of the colour palette, first row, Figure 6; dark black areas, third row, Figure 6). The laser points are 10 cm apart (frame randomly picked from 30% test dataset). Still frames T2 and T4 are from "dense stone lag shapes" whereas T5 is "bare sandy bottom". Conspicuous epilithic species in the UI, noticeably colonies of A. digitatum and dense bioencrustations by S. triqueter.

Discussion
The investigation focused on the scientific validation of a transferrable and quantitative (i.e., continuous data prediction) habitat mapping routine, suited to the spatially explicit demarcation of subtidal natural hard substrates environments (i.e., in the offshore North Sea setting) with high potential of colonisation by epilithic fauna. The proposed methodology exploited a combination of contemporary and non-destructive (minimally invasive) remote sensing (ship-based MBES soundings) and ground truthing technologies (UI by drop frame), interlinked by a state-of-the-art machine learning predictive algorithm (i.e., using RF for regression). The innovative character of the habitat mapping approach tested in this investigation is the quantitative exploitation of the UI for the development of an ecologically informed SNHS habitat delineation criterion, aiding the spatial delineation of the realised niche of reef-associated epilithic organisms. The following paragraphs discuss the success of this investigation, focusing on the exploitation of UI for a quantitative appraisal of SNHS habitat and the potential of this data for ecologically informed habitat modelling. Lastly, shortcomings and future outlooks are highlighted.
In line with recent applications of UI for the characterisation of SNHS and associated epibenthic assemblages in the English Channel and Southern North sea marine regions setting, e.g., [21,[24][25][26]37,73], seafloor imaging by drop/drift frames is confirmed as a highly time efficient (i.e., number of samples vs. sampling time) approach, providing spatially representative data of interest for the characterisation of high-frequency (in respect to the acoustical signal penetration into the seafloor; see [18,74]) and high-resolution (in Figure 6. RF spatial predictions within the subareas selected for modelling. Locations shown in Figure 2. First row: predicted % areal cover of stones Ø ≥ 2 cm/m 2 (ISO 14688-1:2017 grain size classes) within quadrants I, II, III, and IV. The CBI is displayed beneath top and mid rows. BS: backscatter strength. Second row: coefficient of variation of the predictions. Third row: masked model predictions where the CV == 0, along with pictures of coincident Hamon grab samples and validation still frames from underwater videos (note video transect label for relative location on the maps). A semi-transparent hill shade layer is superimposed with 3 × vertical exaggeration for a prospective view of the surrounding geomorphology and the map colour palettes are displayed using quantile breaks. Fourth row: pictures of coincident Hamon grab samples. Fifth row: coincident underwater imagery. The laser points are 10 cm apart (frame randomly picked from 30% test dataset). Still frames T2 and T4 are from "dense stone lag shapes" whereas T5 is "bare sandy bottom". Conspicuous epilithic species in the UI, noticeably colonies of A. digitatum and dense bioencrustations by S. triqueter.

Discussion
The investigation focused on the scientific validation of a transferrable and quantitative (i.e., continuous data prediction) habitat mapping routine, suited to the spatially explicit demarcation of subtidal natural hard substrates environments (i.e., in the offshore North Sea setting) with high potential of colonisation by epilithic fauna. The proposed methodology exploited a combination of contemporary and non-destructive (minimally invasive) remote sensing (ship-based MBES soundings) and ground truthing technologies (UI by drop frame), interlinked by a state-of-the-art machine learning predictive algorithm (i.e., using RF for regression). The innovative character of the habitat mapping approach tested in this investigation is the quantitative exploitation of the UI for the development of an ecologically informed SNHS habitat delineation criterion, aiding the spatial delineation of the realised niche of reef-associated epilithic organisms. The following paragraphs discuss the success of this investigation, focusing on the exploitation of UI for a quantitative appraisal of SNHS habitat and the potential of this data for ecologically informed habitat modelling. Lastly, shortcomings and future outlooks are highlighted.
In line with recent applications of UI for the characterisation of SNHS and associated epibenthic assemblages in the English Channel and Southern North sea marine regions setting, e.g., [21,[24][25][26]37,73], seafloor imaging by drop/drift frames is confirmed as a highly time efficient (i.e., number of samples vs. sampling time) approach, providing spatially representative data of interest for the characterisation of high-frequency (in respect to the acoustical signal penetration into the seafloor; see [18,74]) and high-resolution (in respect to the sample size match between remote sensing and ground truthing data) MBES data. The UI observations revealed comprehensive details of the seafloor nature: substrate, biota, and their spatial organisation and structure (size of visible biotic and abiotic occurrences) and provided insight into sedimentary processes and geo-bio relations. Based on such data, the investigation showed how UI enables deriving an ecologically relevant spatial delineation criterion to map SNHS habitat. Firstly, the proportion of biologically colonised substrate could be related to specific grain size fractions (starting from stones Ø > 2 cm). Consequently, UI provided the quantitative information needed to set up predictive seafloor modelling, in this case producing maps capturing the percent cover (%/m 2 ) of the grain size fractions with the highest potential of colonisation by epilithic fauna. Thus, UI satisfies both a numerical ecological appreciation of bio-geo relations and the production of a dataset enabling ecologically informed seafloor substrate modelling. Consequently, the seafloor model herein produced may not be considered as a mere substrate map (i.e., abiotic map), rather, as a benthic habitat map [13] specifically related to SNHS assemblages of epilithic fauna. Here it must be mentioned that including environmentally dynamic variables (i.e., hydrodynamics) at relevant scales in the modelling process [75] would inevitably enhance the ecological significance of the modelled map outputs, including a refined and mechanistic understanding of the fine-scale driving factors of SNHS communities. Interestingly, if such seafloor predictions are repeated in time (i.e., by repeating the data acquisition and processing workflow herein tested, on datasets acquired, e.g., during different seasons), spatiotemporal and high-resolution changes in the areas of potential epibenthic colonisation may further help elucidate environmental drivers and temporal dependencies of such benthic communities. Indeed, considerations of the fourth dimension (i.e., time) in BHM studies is fundamental for the interpretation of static maps (i.e., "snapshot-in-time") and of the UI data and derived observations. With regard to the latter, it must be noted that mechanisms of seafloor disturbance (both natural and/or anthropogenic, e.g., periodic stone burial and/or abrasion by bottom fishing gears), may cause a bias to the biological observations (e.g., colonisation). Therefore, it is important to consider that the resulting datasets and resulting spatial models may have a strong dependence on the local seafloor spatiotemporal dynamicity over timescales ranging from seconds to years.
With regard to spatial modelling, as for other recent investigations, e.g., [27,[35][36][37]63] the use of RF is confirmed as a well-performing (accurate) and versatile predictive algorithm. From a relatively limited number of loci, RF exploits the multivariate information, including highly correlated features, while avoiding overfitting [34]. RF learns complex underlying non-linear relations (e.g., Figure 5, partial dependence plots in this investigation) between explanatory features (i.e., the hydroacoustic data) and the target response (derived from UI), predicting fitted relations on the broader "unseen" surveyed area and revealing trends and patterns that may reveal casualties and even be used to formulate novel scientific hypotheses [76]. Using RF for regression produces output models that are interpretable as a seafloor morpho-sedimentological continuum. Unlike predictive modelling of habitat classification schemes [77], continuous spatial predictions provide finer ecological information (e.g., gradients [36,78]) and enable better exploitation of high-resolution hydroacoustic data, leading to a better depiction of seafloor heterogeneity, evidencing fine-scale variations that may otherwise remain unnoticed where a single categorical/thematic class is used instead. Further benefits of RF for regression and for the case of modelling spatially clustered and heterogeneous seafloor features, such as SNHS, is the randomisation of the training data points in each decision tree. This leads to a new set of training data points informing the model's learning process, and since the spatial selection is ignored by the random selection process, the influence of spatial autocorrelation is limited, making it an appropriate modelling approach for spatially clustered data (e.g., mapping of polymetallic nodules in [27]). Furthermore, RF benefits from a relatively simple model-tuning, and does not rely on any prior transformation/standardisation of the input data, or on any a priori assumption/requirement of the data structure (e.g., normality; [34]). Finally, at the cost of increasing the computational time, a favourable characteristic of RF for regression is the potential of deriving spatially coincident measures of uncertainty for the model predictions, such as the CV herein used, providing interesting insights into the algorithm decisions consistency in allocating the learnt predicted values to the unseen data. Moreover, coincident uncertainty measures enabled spatial subselection of the model predictions, masking overestimated values (i.e., those predicted on obviously sandy areas; cfr. hill shade prospective view in Figure 5), and explicitly revealing the two main spatial distribution patterns of the modelled response: distributed as striations and elliptical geometries of hard substrate. Striations are indicative of a strong tidal stream and the presence of hard substrates [79] and were also observed by earlier studies in nearby study sites (e.g., English Channel [80]; Belgian waters, [81,82]). Holme and Wilson [80] described such striations as "longitudinal tide swept furrows" associated with rich epibenthic communities. For the study site assessed in this investigation, this observation is confirmed by the UI (e.g., Figure 5), displaying a typical SNHS fauna, including long-lived (i.e., >5 yrs.), vertically developing, and arborescent sessile epilithic species. Indeed, knowledge of the spatially explicit distribution of the stone's coverage (i.e., by seafloor units of 1 m 2 ) is an invaluable asset to biological data survey planning, as well as providing the baseline for environmental monitoring at ecologically relevant scales. Recent studies, including the present investigation, demonstrated that stone size is a fundamental driver of SNHS epibenthic (epilithic) species assemblages [24,25]. Furthermore, besides the size of individual stones, seascape patterns, such as the striations and elliptical shapes herein identified, can be important proxies of biological communities (describing the relation between geological and biological structure, e.g., [83,84]. Therefore, at such scales, the integration of MBES and UI by RF for regression largely improves the identification of the ecology of such seascape patterns. Hitherto, only a few cases have been published showing the importance of geological complexity or geodiversity in marine management, e.g., [84,85], but clearly if these coincide with hotspots of biodiversity, a new era is emerging in the mapping of SNHS. This is indeed the case for the Southern North Sea region, having a complex geological history and a diverse morphosedimentary setting. For example, the results of this research (including the interpretation of the UI-derived community matrix; not shown here) may suggest that all SNHS striations and elliptical shapes in this specific environmental setting may be considered as priority areas for conservation and monitoring, hosting high-priority SNHS biological assemblages.
While the instrumentation and set-up used in the present investigation proved highly valuable for targeting the research aim, some improvements could be made at both the UI data acquisition and processing and the predictive modelling phases. Hereafter, shortcomings are flagged. Technical improvements are needed in the design of the UI acquisition platform, including increased illumination allowing one to position the recording camera higher above the seafloor to naturally achieve a viewing angle closely matching the resolution of the MBES data (i.e., 1 m). Although videos were acquired under optimal sea state conditions and tidal windows, and extracted image stills were well positioned (i.e., georeferenced to the drop-frame cable's breakpoint), installation of a dedicated positioning system would represent a considerable positional improvement and decrease the chances of positional error propagation in the modelling phase. Analysis of underwater imagery can be highly time-consuming given the high number of data points compared to conventional sampling gears. While the extraction of parameters (i.e., such as the stone's Ø and areal cover) from still images was mostly automated, the annotation, i.e., segmentation into ROIs and labelling into colonised and/or uncolonised, was performed manually. The success of fully automated image-annotation applications [86] is inextricably linked to the quality of the underwater imagery, in turn dictated by the specific set-up of the various components on the UI acquisition platform, by the quality and functioning of the components themselves, and by the environmental conditions at the time of UI acquisition. Thus, it is expected that considerable improvements in the use of such technology for seafloor ecological characterisation and quantitative predictive modelling will come in from R&Doriented technical improvements of the UI acquisition platform itself, as well as from the field of computer science and big data, in the application of machine and deep learning algorithms to the automated annotation of UI [87], as well as image processing, correction, and filtering (e.g., during data acquisition, minimising a posteriori labour-intensive aspects of image analysis).
Specifically relating to relatively turbid water environment, development of "clearwater housing" underwater optical systems [88] will inevitably benefit future SNHS research. Regarding the RF predictive modelling, as reported by others [27], the most significant improvements in predictive accuracy and performance result from the use of more training data points.
The availability of more data, especially if they are better distributed (i.e., data that will include the entire range of the target response, for example from all the different topographies and acoustical facies), would inevitably reinforce the model in learning improved and wider relationships between the predictor and response variables, while also ensuring a larger number of validation data points to derive accuracy metrics from. Here, use of towed sledge systems and or/autonomous underwater vehicles (AUVs) represent an invaluable asset to the collection of larger volumes of UI in complement to the MBES data [89]. In this regard, an important aspect of machine learning application to habitat mapping studies will be to address the effect of the amount of training data on model performance. This aspect remains poorly explored in this domain and is at the centre of discussion of recent research, e.g., [27]. This aspect will be particularly important in view of upscaling the habitat mapping approach tested in this investigation and incorporating it into standards designed to specifically survey and characterise SNHS habitat. It will be important to define quantitatively the amount of UI groundtruth data to be collected, in respect to the extent of the hydroacoustic survey. A sufficient underlying number of observations is needed for a representative model to be constructed. In this investigation, modelling tests (not shown here) were conducted to predict the target response over the entire 32 km 2 surveyed area. However, these produced poorer accuracies, evidencing the relative paucity of UI observations available for this study. While a good practice approach is to check, prior to the modelling phase, the representativeness of the training and testing groundtruth and underlying MBES datasets (i.e., by comparing the probability density functions from the primary MBES data prediction surface and sample locations; not shown), further tests are required to elucidate this aspect further, particularly in view of the large volumes of MBES data expected to become available in the coming years [90]. Finally, future tests based on this predictive modelling approach may benefit from the use of RF for quantile regression, having the potential to improve the uncertainty reporting by exploiting lower and upper quartiles as confidence around the mean predictions [66].

Conclusions
The results of this investigation show that the acquisition, processing, and analysis of underwater seafloor imagery data provide seafloor mappers with quantitative and ecologically relevant information on the distribution of subtidal natural hard substrates. This numerical information can be integrated with ship-based MBES data by means of RF machine learning, here used to compute high-resolution predictions of stone areal coverage (%/1 m 2 ) at scales relevant for ecological management. Such spatial predictions provide a direct link to the distribution of SNHS epibenthic assemblages that are a priority for environmental conservation and monitoring in European seas and elsewhere. Furthermore, this investigation demonstrated how the use of underwater imagery enabled understanding of how stone size influences biological colonisation and, consequently, how this information can be quantitatively exploited to develop ecologically meaningful habitat delineation criteria.