Legacy Data: How Decades of Seabed Sampling Can Produce Robust Predictions and Versatile Products

Abstract: Sediment maps developed from categorical data are widely applied to support marine spatial planning across various fields. However, deriving maps independently of sediment classification potentially improves our understanding of environmental gradients and reduces issues of harmonising data across jurisdictional boundaries. As the groundtruth samples are often measured for the fractions of mud, sand and gravel, this data can be utilised more effectively to produce quantitative maps of sediment composition. Using harmonised data products from a range of sources including the European Marine Observation and Data Network (EMODnet), spatial predictions of these three sediment fractions were generated for the north-west European continental shelf using the random forest algorithm. Once modelled these sediment fraction maps were classified using a range of schemes to show the versatility of such an approach, and spatial accuracy maps were generated to support their interpretation. The maps produced in this study are to date the highest resolution quantitative sediment composition maps that have been produced for a study area of this extent and are likely to be of interest for a wide range of applications such as ecological and biophysical studies.


Introduction
Continental shelf seas cover only ≈9% of the global seafloor [1], but are biologically productive, important for biogeochemical cycling [2], provide a wide range of resources and services to humanity, while at the same time experiencing increased human impacts [3]. Although these shallow seas (water depths generally less than 200 m) are relatively well researched, there is still a lack of detailed maps of seafloor sediments, substrates, habitats or even bathymetry. In Europe, the European Marine Observation and Data Network (EMODnet) was incepted in 2009 to provide a gateway to marine data across seven discipline-based themes, including geology. The EMODnet-Geology theme aims at providing harmonised information on marine geology in Europe. One of the central products is a seabed substrate map of European maritime areas [4]. This map was compiled by harmonising substrate information from more than 30 countries and consolidating the data into a single map product with three unified substrate classification schemes based on a modification of the Folk classification [5]. The Folk scheme classifies the sediment based on the sediment fractions of mud (grain size d < 63 µm), sand (63 µm ≤ d < 2 mm) and gravel (d ≥ 2 mm). Simplifying this scheme from the original 15 sediment classes to six and four sediment classes has allowed harmonisation of the European seabed substrate data into a unified substrate map. However, some differences cannot be resolved by nesting multiple classes within a broader class. For example, Folk [5] suggested a trace amount of gravel could be 0.01%, whereas in the current interpretation of "trace" applied by the British Geological Survey (BGS), a fraction of 1% gravel is applied [6]. Another example in the difference of sediment classification schemes has been the interpretation of the boundary between muddy sand and sand. The original definition of this boundary was based on a 9:1 ratio of sand to mud. 
This definition has been widely used; for example, it is the approach taken by the EMODnet-Geology harmonised seabed substrate map. However, there have been other variations of this approach, such as the BGS modified Folk diagram described in Long [6]. Here, sediments with less than 5% gravel are separated at a 4:1 ratio of sand to mud into the classes 'sand and muddy sand' and 'mud and sandy mud'. This BGS modified Folk diagram has been widely adopted within the United Kingdom (UK), particularly in projects such as the UK Marine Protected Areas Programme [7].
While there are issues with comparing or harmonising maps derived from different classification schemes, the sediment data used to generate these maps often has more information about sediment grain size than simply class type. Half phi (ϕ) grain size distribution or the sediment fractions of mud, sand and gravel are commonly measured but rarely used for mapping. Information is lost in the process of simplifying the relative abundance of the grain size components to a sediment class. For example, within the Folk triangle the percentages of mud, sand and gravel vary within a sediment class by between 1% and 50% depending on the specific class [5]. While this may be acceptable for certain applications, benthic species assemblages do not fit neatly into different sediment classes [8] and these classes would not necessarily be appropriate to inform certain human activities (e.g., engineering work or aggregate extraction). Therefore, given the limitations of classified substrate maps, there is a need for alternative approaches, for example to relate to species occurrence data.
Recently, methods have been developed to produce quantitative sediment maps that make better use of the quantitative grain size data. Lark et al. [9] derived additive log-ratios of the three sediment fractions that could then be modelled across the UK continental shelf. These additive log-ratios were subsequently converted back into the relative sediment fractions thereby predicting the distribution of mud, sand and gravel at the seabed. This geostatistical approach also allowed the authors to express the local probability of each class. While those authors used cokriging, Stephens and Diesing [10] spatially predicted sediment composition with the Random Forest [11] algorithm based on additive log-ratios. Diesing [12] further highlighted the value of quantitative sediment maps by applying a similar method at a fine scale (10 m resolution) to predict sediment fractions across a site of approximately 15,000 km². The layers produced by these models have already proved valuable in other work such as predicting the spatial distribution of organic carbon in surficial shelf sediments [13], quantifying and valuing organic carbon flows and stocks on the UK continental shelf [14], understanding variation in benthic pH gradients [15] and assessing North Sea demersal fisheries in relation to benthic habitats [16]. To further support this desire by scientists for regional continuous variables that are suitable for a range of applications, Wilson et al. [17] generated a range of layers including mud, sand and gravel fractions for the north-west European shelf. However, the resolution of these data was coarse at a spatial resolution of 0.125° by 0.125° (approximately 8 km by 13 km, although this varies with latitude), and the methodology considered each sediment component in isolation, which is not suitable for compositional data of this type [18].
In line with the efforts of EMODnet to unify outputs across Europe, and to apply state-of-the-art methods for modelling the distribution of sediments this study investigates the application of quantitative sediment composition models at the scale of a European sea-basin. Stephens and Diesing [10] developed these techniques to predict the fractions of the three sediment components of mud, sand and gravel for an area of UK and North Sea. While this approach was largely successful, producing an overall accuracy of 0.83, this initial study was limited in geographic extent and spatial resolution (500 m). Further, when considering the high groundtruth sample density in certain areas of their study area, such as the North Sea, it is likely that some pixels were attributed to more than one sample. As samples were randomly separated into training and testing datasets, this could have had the effect of inflating the reported accuracy.
Stephens and Diesing [10] also calculated prediction intervals to quantify the reliability of the predictions. However, as these related to the two additive log-ratios, they remained somewhat difficult to interpret. Prediction error is likely to be concentrated in certain regions of a map, such as in areas of high complexity [19,20] or around poorly sampled features [21]. A number of papers have presented methods for representing the spatial distribution of map error [20,22], and the incorporation of these types of maps has been advocated elsewhere [23,24]. Here we present spatial accuracy maps to accompany the updated substrate maps, which will support the product's use in future studies. The presented methodology draws upon the work of Comber et al. [22] by applying a weighting function to understand how accuracy varies across the study site. The objectives of this study are therefore to provide high-resolution (7.5 arc seconds or approximately 130 m by 230 m) spatial models of sediment composition in continuous and classified form, accompanied by maps of spatially explicit map accuracy/error and covering large parts of the north-west European continental shelf.

Study Area
The study area focusses on the north-west European continental shelf and includes the North Sea, Irish Sea, Celtic Sea, English Channel and Skagerrak (Figure 1a). This includes areas within the national maritime boundaries of Belgium, Denmark, France, Germany, Netherlands, Norway, Republic of Ireland, Sweden and the United Kingdom and Channel Islands.


Substrate Observations
Seabed samples were collated from several sources including national marine and geological institutes (Supplement S1). Some of these sources contained duplicate samples, but once these were removed the sample data downloaded from various sources consisted of approximately 68,000 samples where particle size distribution data existed. However, it was necessary to filter the data to remove potentially problematic samples, such as those where the reported sum of the percentages of mud, sand and gravel did not equal 100%. Samples collected prior to 1990 were also discarded, as these records may have imprecise positioning prior to the adoption of Global Positioning System. Inadequately recorded metadata meant that for many samples there was no information about how the grain size percentages were measured. Based on the disproportionate number of samples that were recorded with grain size percentages that were in round numbers (such as 50% sand/50% mud or 25% gravel/75% sand) it is likely that the fractions reported for these samples were estimated rather than analysed quantitatively. Commonly occurring fractions that were suspected of not being quantitatively measured were also discarded from analysis.
The density of samples varied considerably across the study site. As would be expected, areas near the coast were typically sampled at a higher density than those in deep environments and near the edge of the continental shelf. In areas of high sample density, it was common to have more than one sediment sample per unit of analysis (i.e., the spatial resolution of pixels used as predictor variables). Where this occurred, an average of the percentages of mud, sand and gravel was calculated to produce one set of fractions that was representative of that pixel. This resulted in a total sediment sample dataset of 45,761 samples.
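The filtering and per-pixel averaging described above can be sketched as follows. This is a minimal illustration and not the authors' code: the record layout, the helper names (`is_usable`, `cell_index`, `average_per_cell`) and the 0.5% tolerance on the sum of fractions are assumptions made for the example, and the screening of suspiciously round fractions is not reproduced.

```python
from collections import defaultdict

CELL_DEG = 7.5 / 3600.0  # 7.5 arc seconds expressed in decimal degrees


def is_usable(sample, tol=0.5):
    """Keep samples from 1990 onwards whose fractions sum to 100% (within tol).

    The tolerance is an assumption; the text only says the sum must equal 100%.
    """
    total = sample["mud"] + sample["sand"] + sample["gravel"]
    return sample["year"] >= 1990 and abs(total - 100.0) <= tol


def cell_index(lon, lat):
    """Map a WGS84 position to the (column, row) of its 7.5 arc second cell."""
    return (int(lon // CELL_DEG), int(lat // CELL_DEG))


def average_per_cell(samples):
    """Average the mud, sand and gravel percentages of samples sharing a cell."""
    groups = defaultdict(list)
    for s in samples:
        if is_usable(s):
            groups[cell_index(s["lon"], s["lat"])].append(s)
    averaged = {}
    for cell, members in groups.items():
        n = len(members)
        averaged[cell] = tuple(
            sum(m[key] for m in members) / n for key in ("mud", "sand", "gravel")
        )
    return averaged


# Hypothetical records: two fall in the same cell, one predates 1990,
# and one has fractions that do not sum to 100%.
samples = [
    {"lon": 2.0001, "lat": 54.0001, "year": 2005, "mud": 10.0, "sand": 80.0, "gravel": 10.0},
    {"lon": 2.0002, "lat": 54.0002, "year": 2010, "mud": 20.0, "sand": 70.0, "gravel": 10.0},
    {"lon": 3.5000, "lat": 55.0000, "year": 1985, "mud": 5.0, "sand": 90.0, "gravel": 5.0},
    {"lon": 4.0000, "lat": 56.0000, "year": 2001, "mud": 30.0, "sand": 30.0, "gravel": 30.0},
]
cells = average_per_cell(samples)
```

In this toy input only the first two records survive the filters, and because they share a cell they collapse to a single averaged composition.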
The mud, sand and gravel fractions are compositional data, i.e., the sum of these fractions must equal 1 (or 100%) with each fraction constrained between 0 and 1. Therefore, each component should not be considered in isolation from the others. We follow the recommendations of Aitchison [18] and transform the data onto the additive log-ratio (ALR) scale, where they can be analysed as two continuous, unconstrained response variables that can assume any value. ALR transformations are undefined if any observed value is zero. Therefore, we used the same rationale as Lark et al. [9], and all zero fractions were changed to the lowest observed fraction in the groundtruth data (0.01). Here we have selected to use the gravel fraction as the denominator of the log-ratio, but it should be noted that the choice of variable does not affect the final outcome of the analyses [25]:
$$\mathrm{alr}_m = \ln\left(\frac{\mathrm{mud}}{\mathrm{gravel}}\right), \qquad \mathrm{alr}_s = \ln\left(\frac{\mathrm{sand}}{\mathrm{gravel}}\right)$$
The two additive log-ratios alr_m and alr_s constitute two response variables, and separate predictive models can be built for each individually.
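As an illustration of the transformation, the short sketch below computes the two log-ratios with gravel as the denominator after replacing zero fractions. The function name `alr_transform` is hypothetical, and the example assumes that fractions are expressed on a 0 to 1 scale and that the quoted value of 0.01 is on that same scale.

```python
import math

MIN_FRACTION = 0.01  # lowest observed fraction quoted in the text (scale assumed)


def alr_transform(mud, sand, gravel, floor=MIN_FRACTION):
    """Return (alr_m, alr_s) with gravel as the denominator.

    Zero fractions are replaced by `floor` first, because log(0) is undefined.
    """
    mud, sand, gravel = (f if f > 0 else floor for f in (mud, sand, gravel))
    return math.log(mud / gravel), math.log(sand / gravel)


# A gravel-free sample: the zero gravel fraction is replaced by 0.01.
alr_m, alr_s = alr_transform(0.2, 0.8, 0.0)

# A sample with equal mud and gravel gives alr_m = 0.
alr_m_equal, _ = alr_transform(0.1, 0.8, 0.1)
```

Both outputs are unconstrained real numbers, which is what allows each to be modelled as an ordinary continuous response.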
The data were then split randomly into training and testing datasets based on a 67/33% split (i.e., 30,480 training observations and 15,281 testing observations).

Predictor Variables
Variables used in the model are summarised in Table 1 and displayed in Figure 1a-h. Predictor variables were informed by Stephens and Diesing [10] and selected based on what was observed to be important for explaining the distribution of sediments. The EMODnet bathymetry is available in the World Geodetic System 1984 and has a grid size of 1/8 arc minute × 1/8 arc minute (equal to 7.5 arc seconds). This equates to approximately 155 m × 230 m (x × y) in the south of the study area and 116 m × 230 m in the north of the study area. All other predictor variables (see below) were resampled onto this same 7.5 arc seconds grid. Bathymetric position indices [27] were calculated from the bathymetry digital terrain model (DTM) at two neighbourhood sizes that were thought to capture the local and regional variation and were sufficiently distinct to have limited correlation.
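The quoted cell dimensions can be checked with a short calculation. The sketch below assumes a spherical Earth with roughly 111,320 m per degree of latitude, and uses 48°N and 60°N as illustrative southern and northern latitudes; neither value is taken from the study.

```python
import math

METRES_PER_DEGREE = 111_320.0  # assumed spherical-Earth approximation
CELL_DEG = 7.5 / 3600.0        # 7.5 arc seconds in decimal degrees


def cell_size_m(lat_deg):
    """Approximate (width, height) in metres of a 7.5 arc second cell."""
    height = CELL_DEG * METRES_PER_DEGREE
    width = height * math.cos(math.radians(lat_deg))
    return width, height


for lat in (48.0, 60.0):
    width, height = cell_size_m(lat)
    print(f"lat {lat:.0f}N: {width:.0f} m x {height:.0f} m")
```

Under these assumptions the cell height stays near 230 m while the width shrinks with the cosine of latitude, which is consistent with the 155 m and 116 m figures given above.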
Two components of the hydrodynamic regime acting on the seabed were modelled and included for analysis. These were the average current speed and the wave peak orbital velocity at the seabed. Current speeds were derived from a purpose-built TELEMAC2D model with a mesh spacing ranging from 0.5 km to 10.0 km depending on the proximity to coast. This was then interpolated to the same 7.5 arc seconds grid as the other data layers. Peak orbital velocity of waves at the seabed were derived from a European continental shelf model of peak wave height and period from 2001-2010. This was based on a grid spacing of approximately 11 km. Using the method of Soulsby [28], this peak wave height and period were combined with depth and interpolated to the bathymetric grid (further details for current speed and peak wave velocity in Supplement S2 in the supplementary material).
Suspended inorganic particulate matter [29] derived from satellite imagery was downloaded from the Copernicus marine portal (OCEANCOLOUR_GLO_OPTICS_L4_REP_OBSERVATIONS_009_081). Data were downloaded at a 4 km resolution as monthly averages between January 2003 and December 2017. Data were averaged across the 15 years for the summer months (June, July and August) and winter months (December, January and February) to produce two separate rasters. Pixels obscured by cloud cover were excluded from analysis, so there is the potential for some bias, as turbidity may be associated with increased cloud cover and would therefore be underrepresented in the dataset. Values are in g/m³. Rasters were then interpolated to the same 7.5 arc seconds grid as the other data layers.
Euclidean distance to coast was calculated in ArcGIS and was expected to be an indicator of distance to sediment source. Stephens and Diesing [10] observed the importance of this layer as a predictor variable.

Modelling
The random forest prediction algorithm [11] was selected as the model for this analysis as it showed a high level of predictive accuracy in similar studies [10,12], and is commonly applied to various modelling domains [30,31]. Random forests can be used without extensive parameter tuning, can handle many predictor variables and are insensitive to the inclusion of noisy or irrelevant features. Random forest models were implemented with the randomForest package [32] in R [33]. Forests had 500 trees and all other model parameters were kept as default.
Fitted models were applied to predict the two response variables (alr_m and alr_s) as rasters, based on the available predictor variables. To generate the raster predictions of the three sediment fractions (mud, sand and gravel), the two response variables were back-transformed using the inverse of the additive log-ratio transformation:
$$\mathrm{mud} = \frac{e^{\mathrm{alr}_m}}{e^{\mathrm{alr}_m} + e^{\mathrm{alr}_s} + 1}, \qquad \mathrm{sand} = \frac{e^{\mathrm{alr}_s}}{e^{\mathrm{alr}_m} + e^{\mathrm{alr}_s} + 1}, \qquad \mathrm{gravel} = 1 - (\mathrm{mud} + \mathrm{sand})$$
Using the abundance of the three sediment fractions, any classification scheme based on mud, sand and gravel fractions can be applied to create a classified map. These include commonly used schemes such as the Folk 5, Folk 7 and Folk 16 classes (where the number reflects the total number of classes in the classification scheme including one class for hard substrate) used in EMODnet Geology [34] and the EUNIS Level 3 classification for broadscale sedimentary habitats based on the simplified Folk triangle [6].
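A minimal sketch of the back-transformation, followed by one classification rule, is given below. The function names are hypothetical. Only the rule stated in the Introduction for sediments with less than 5% gravel (a 4:1 sand to mud ratio) is implemented; the treatment of a sample lying exactly on the boundary is an assumption, and the coarser classes are deliberately left unresolved.

```python
import math


def alr_inverse(alr_m, alr_s):
    """Back-transform gravel-denominator log-ratios to (mud, sand, gravel)."""
    em, es = math.exp(alr_m), math.exp(alr_s)
    denom = em + es + 1.0
    mud, sand = em / denom, es / denom
    return mud, sand, 1.0 - (mud + sand)


def low_gravel_class(mud, sand, gravel):
    """Apply the 4:1 sand:mud rule for sediments with less than 5% gravel."""
    if gravel >= 0.05:
        return None  # coarser classes are outside the scope of this sketch
    # Assumption: a ratio of exactly 4:1 is assigned to the sandier class.
    return "sand and muddy sand" if sand >= 4.0 * mud else "mud and sandy mud"


# Round trip: fractions (0.3, 0.6, 0.1) give alr_m = ln 3 and alr_s = ln 6.
mud, sand, gravel = alr_inverse(math.log(3.0), math.log(6.0))

sandy = low_gravel_class(0.10, 0.88, 0.02)
muddy = low_gravel_class(0.50, 0.48, 0.02)
coarse = low_gravel_class(0.30, 0.60, 0.10)
```

Because the three back-transformed fractions always sum to 1, any scheme defined on the mud, sand and gravel triangle can be applied to the same pair of predicted rasters.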

Model Validation
The random forest algorithm implicitly carries out a form of cross-validation using the 'out-of-bag' (OOB) observations (i.e., the observations not included in each tree). In addition, the models are validated against the test set of observations. The performance is assessed by calculating the mean of the squared prediction error (MSE):
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
where y_i are observed values, ŷ_i are predicted values and n is the number of observations. The 'variance explained' (VE) by the model is then calculated from the ratio of the MSE to the variance (σ²) of the observed values:
$$\mathrm{VE} = 1 - \frac{\mathrm{MSE}}{\sigma^2}$$
Using the test sample data set, classification accuracy was measured based on the predicted sediment type versus the original sediment type using the EUNIS Level 3 broadscale sediment classes, Folk 5 and Folk 16 maps. From this the overall accuracy, user's and producer's accuracy were calculated from the confusion matrix.
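The validation measures named above can be illustrated with a few lines of code. This is a sketch under stated assumptions: the function names are hypothetical, the variance is taken as the population variance (dividing by n), and the toy values are invented for the example rather than drawn from the study.

```python
from collections import Counter


def mse(observed, predicted):
    """Mean of the squared prediction errors."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted)) / len(observed)


def variance_explained(observed, predicted):
    """One minus the ratio of MSE to the (population) variance of observations."""
    mean = sum(observed) / len(observed)
    variance = sum((y - mean) ** 2 for y in observed) / len(observed)
    return 1.0 - mse(observed, predicted) / variance


def accuracy_metrics(reference, predicted):
    """Overall, user's and producer's accuracy from paired class labels."""
    pairs = Counter(zip(reference, predicted))
    ref_totals = Counter(reference)
    pred_totals = Counter(predicted)
    classes = sorted(set(reference) | set(predicted))
    overall = sum(pairs[(c, c)] for c in classes) / len(reference)
    # User's accuracy: share of points mapped as a class that truly belong to it.
    users = {c: pairs[(c, c)] / pred_totals[c] for c in classes if pred_totals[c]}
    # Producer's accuracy: share of reference points of a class mapped correctly.
    producers = {c: pairs[(c, c)] / ref_totals[c] for c in classes if ref_totals[c]}
    return overall, users, producers


obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 6.0]
ve = variance_explained(obs, pred)

reference = ["sand", "sand", "sand", "mud", "mud", "mixed"]
mapped = ["sand", "sand", "mud", "mud", "sand", "sand"]
overall, users, producers = accuracy_metrics(reference, mapped)
```

In the toy labels, 'mixed' is never predicted, so it has a producer's accuracy of zero and no user's accuracy at all, which mirrors how rarely sampled classes can score poorly while barely affecting the overall accuracy.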
To represent the spatial distribution of error for each of the three sediment fractions, the local Root-Mean-Squared-Error (RMSE) was calculated across the site. To do this, the squared error of each test sample was calculated based on the difference between the observed and predicted sediment fraction. A smoothed surface of local RMSE was then generated using the Inverse Distance Weighted (IDW) technique in ArcGIS. Each pixel's RMSE was determined based on the closest 50 points (up to a maximum distance of 200 km). A weighting power function was applied in the IDW tool (set at 0.3) so nearer points contributed more to the pixel than distant points. The number and maximum distance were selected to produce an error map that had full spatial coverage but was locally constrained where sufficient samples were present. This IDW function was applied using a 1000 m × 1000 m grid to simplify computer processing.
For the classified predictions, spatial accuracy was calculated using a locally constrained confusion matrix. Here test samples were converted to a Boolean value based on whether they were correctly classified. The IDW technique was applied to calculate a local thematic accuracy value. As above, this was applied based on the closest 50 points (maximum distance of 200 km) with a weighting power function of 0.3. This IDW function was applied using a 1000 m × 1000 m grid to simplify computer processing.
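The local error and accuracy surfaces can be approximated outside a GIS with a plain inverse distance weighted mean. The sketch below is not the ArcGIS implementation. It assumes planar coordinates in metres, assumes that the squared errors are interpolated first and the square root taken afterwards, and uses hypothetical function names.

```python
import math


def idw_local_mean(target, points, values, k=50, max_dist=200_000.0, power=0.3):
    """IDW mean of `values` at `target` from the k nearest points within max_dist."""
    neighbours = []
    for (x, y), v in zip(points, values):
        d = math.hypot(x - target[0], y - target[1])
        if d <= max_dist:
            neighbours.append((d, v))
    neighbours.sort(key=lambda item: item[0])
    neighbours = neighbours[:k]
    if not neighbours:
        return None  # no test samples within the search distance
    if neighbours[0][0] == 0.0:
        return neighbours[0][1]  # target coincides with a sample location
    weights = [d ** -power for d, _ in neighbours]
    return sum(w * v for w, (_, v) in zip(weights, neighbours)) / sum(weights)


def local_rmse(target, points, observed, predicted, **kwargs):
    """Square root of the IDW-weighted mean squared error around `target`."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    mean_sq = idw_local_mean(target, points, squared, **kwargs)
    return None if mean_sq is None else math.sqrt(mean_sq)


def local_accuracy(target, points, correct, **kwargs):
    """IDW-weighted share of correctly classified test samples around `target`."""
    flags = [1.0 if c else 0.0 for c in correct]
    return idw_local_mean(target, points, flags, **kwargs)


# Two test samples 1 km from the target and one beyond the 200 km limit.
points = [(1000.0, 0.0), (0.0, 1000.0), (300_000.0, 0.0)]
rmse_here = local_rmse((0.0, 0.0), points, [0.5, 0.5, 0.9], [0.4, 0.2, 0.0])
acc_here = local_accuracy((0.0, 0.0), points, [True, False, False])
```

With a power as low as 0.3 the weights decay slowly with distance, so the resulting surface is smooth and behaves much like a moving average over the nearest test samples.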
A comparison of classified model outputs with the sediment classification map from Stephens and Diesing [10] was also performed. As the maps had different extents and resolutions, a fishnet grid of 1000 × 1000 points was overlaid on the study site. Points outside the shared map extent or over land were removed and the predictions from both maps were extracted at the location of each remaining point. The results of the comparison were reported as a confusion matrix and overall agreement between maps was also calculated. Both maps would contain error, so the purpose of the comparison was to understand to what degree the changes to input data affected the final predictions.

Features Importance
For both the alr_m and alr_s models, the most important predictor variables were peak orbital wave velocity and mean tidal currents (Figure 2). All other variables were observed to contribute to the model; however, the relative importance changed for the alr_m and alr_s log-ratios.

Model Validation
The model validation statistics (Table 2) indicate that the variance explained by the predictive models was approximately 63% for alr_m and 68% for alr_s. Figure 3 shows the observed versus predicted values for a random subset of the test samples for alr_m and alr_s. The plots show that there is considerable variation that was not explained by the models.

Of the three classification schemes applied, the EUNIS Level 3 map was the most accurate with an overall accuracy of 77.5%, as opposed to 74.1% and 58.8% for Folk 5 and Folk 16, respectively (Table 3). The three confusion matrices show how the class accuracy is highly variable between classes. For example, in the EUNIS Level 3 map 'sand/muddy sand' was the most widespread sediment type and was the most accurately mapped, with a user's accuracy of 79.2%. For the same map the lowest classification accuracy was for the class 'mixed sediments' (user's accuracy of 49.6%), which was also the least sampled class. By comparison, the user's accuracy values for the Folk 16 map were relatively low. Of the 15 sediment classes only four achieved a user's accuracy of >50%. Yet, because these included the three most sampled classes, 'muddy sand', 'sand' and 'sandy mud', which together totalled 73.8% of the samples, the overall accuracy still reached 58.8%. This sampling bias towards sandy sediments was further highlighted by comparing the producer's and user's accuracies within each classification scheme. The producer's accuracy was higher than the user's accuracy for the most sampled class in all three classification schemes (i.e., 'sand/muddy sand' within EUNIS Level 3 and Folk 5, and 'sand' in Folk 16), but for all other classes the producer's accuracy was lower than the user's accuracy (with the exception of the Folk 16 classes 'gravelly sand' and 'slightly gravelly muddy sand').
Comparison of the classified outputs with previous work from Stephens and Diesing [10] indicates a high level of agreement between the predictions. Comparing the EUNIS Level 3 maps, which were the most accurate, the overall agreement between the maps where the two studies shared a similar extent was 78.1% (Table 4). However, map agreement was not consistent between the classes. Of the points classed as 'sand/muddy sand' in the updated sediment map, 81.1% were given the same classification in the Stephens and Diesing [10] map. This compares with only 1.8% agreement for points classed as 'mixed sediments', which was the least widespread class.

Sediment Composition
The predicted spatial distributions of the sediment fractions (mud, sand and gravel) are shown in Figures 4-6 alongside the local RMSE for each sediment fraction. Sand was the most widespread sediment type. The mud fraction was prevalent in deeper areas such as the Norwegian Trough and, to a lesser extent, intra-shelf basins. Areas of high gravel fraction were predicted in the English Channel and other areas that experience high current speeds. For each sediment fraction the error associated with that prediction varied spatially across the study site. For example, while the predicted fraction of sand was high across most of the study site, error was particularly concentrated around the Irish Sea and near the coast (Figure 5). The distribution of map error, as measured by local RMSE, differs between the three sediment fractions; however, all three indicate that the North Sea is an area of higher accuracy.

The classified maps, presented in Figures 7-9, simplify these fractions into three commonly used classification schemes. The EUNIS Level 3 and Folk 5 predictions are generally similar, with the only difference being that Mud/sandy mud is more extensive in the Folk 5 classified map. These differences are most evident in the Fladen Grounds off eastern Scotland, the Oyster Ground north of the Netherlands and the Irish Sea. However, there is variation in local accuracy between the two schemes, most notably around the Fladen Grounds where the Folk 5 map has higher accuracy. The Folk 16 map is more detailed, with sand, muddy sand and sandy mud the most widespread sediment classes. However, the increased specificity resulted in lower local accuracies around the majority of the study area.

Discussion
The maps produced in this study are to date the highest resolution (7.5 arc seconds) quantitative sediment maps that have been produced at the scale of a sea-basin. Previous studies by Stephens and Diesing [10] and Wilson et al. [17] generated sediment predictions at 500 m and 0.125° resolution, respectively. Increased resolution was possible due to improvements in the resolution of the predictive layers available, such as the tidal currents TELEMAC2D model and the bathymetry layers available through the EMODnet project. Not only did a higher-resolution bathymetry layer result in more detailed derivative layers, but the peak orbital velocity of waves at the seabed also improved, as the formula used to calculate it from peak wave height and period requires depth to be known for each pixel. The two most important variables in both the alr_m and alr_s models were mean tidal current velocity and peak orbital velocity of waves at the seabed (Figure 2), both of which were modelled from new data that had a finer resolution. The extent of the study was also increased compared to Stephens and Diesing [10], including areas of the continental shelf around Ireland, northern Scotland, the Norwegian Trough and the Skagerrak. Regional maps such as these avoid the inevitable artefacts that occur at the borders between different datasets, national boundaries or study areas [4,35]. These are typically a result of maps derived from different datasets or under different methods. Where the response variable is categorical only, as is generally the case, map users have few options for dissolving these boundaries so as to reflect reality. However, continuous response variables like the mud, sand and gravel fractions produced using this method may provide a useful tool to resolve these border issues. For example, areas of overlap could inform some degree of calibration factor to apply to one dataset or the other.
Increasing the resolution resulted in an overall accuracy of 78% for the EUNIS Level 3 map, which is less than the accuracy of 83% Stephens and Diesing [10] reported for their equivalent model. The independent test data indicated that approximately 60% and 63% of the variability was explained for the alrm and alrs respectively, which is also less than the 66% and 71% explained by Stephens and Diesing's models. However, based on the number of sediment samples used by Stephens and Diesing [10] and the resolution of their study it appears that there may have been instances with multiple samples per pixel. This might have had the effect of artificially inflating the reported accuracy. As seen in Table 4, there was a high level of agreement between the two studies, but this was primarily

Discussion
The maps produced in this study are, to date, the highest-resolution (7.5 arc seconds) quantitative sediment maps that have been produced at the scale of a sea-basin. Previous studies by Stephens and Diesing [10] and Wilson et al. [17] generated sediment predictions at 500 m and 0.125° resolution, respectively. The increase in resolution was possible because of finer predictor layers, such as the TELEMAC2D tidal current model and the bathymetry layers available through the EMODnet project. Higher-resolution bathymetry not only produced more detailed derivative layers but also improved the estimate of peak orbital velocity of waves at the seabed, because the formula used to calculate it from peak wave height and period requires the depth to be known for each pixel. The two most important variables in both the alr_m and alr_s models were mean tidal current velocity and peak orbital velocity of waves at the seabed (Figure 2), both of which were modelled from new, finer-resolution data. The extent of the study was also increased compared to Stephens and Diesing [10], and now includes areas of the continental shelf around Ireland, northern Scotland, the Norwegian Trough and the Skagerrak. Regional maps such as these avoid the inevitable artefacts that occur at the borders between different datasets, national boundaries or study areas [4,35]. These artefacts typically arise where maps have been derived from different datasets or with different methods. Where the response variable is categorical only, as is generally the case, map users have few options for dissolving these boundaries in a way that reflects reality. However, continuous response variables such as the mud, sand and gravel fractions produced using this method may provide a useful tool for resolving these border issues. For example, areas of overlap could inform a calibration factor to apply to one dataset or the other.
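The dependence of near-bed orbital velocity on depth can be sketched with linear (Airy) wave theory for a single monochromatic wave. This is an assumption made for illustration; the formula used in the study is not reproduced in this section and may differ. The function names and the example inputs are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m s^-2)

def wavenumber(period, depth):
    """Solve the linear dispersion relation w^2 = g k tanh(k h) by bisection."""
    omega = 2.0 * math.pi / period
    k_deep = omega ** 2 / G                   # deep-water limit
    k_shallow = omega / math.sqrt(G * depth)  # shallow-water limit
    # The root lies between the larger of the two limits and their sum.
    lo, hi = max(k_deep, k_shallow), k_deep + k_shallow
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if G * mid * math.tanh(mid * depth) < omega ** 2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def orbital_velocity(height, period, depth):
    """Near-bed orbital velocity amplitude (m/s) under linear wave theory."""
    kh = wavenumber(period, depth) * depth
    if kh > 700.0:
        return 0.0  # effectively no wave motion at the bed; avoids overflow
    return math.pi * height / (period * math.sinh(kh))
```

Evaluating `orbital_velocity` for the same wave height and period at two depths shows how strongly the result depends on the bathymetry value of each pixel: the shallower pixel returns the larger velocity.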
At this increased resolution, the overall accuracy of the EUNIS Level 3 map was 78%, which is lower than the 83% that Stephens and Diesing [10] reported for their equivalent model. The independent test data indicated that approximately 60% and 63% of the variability was explained for alr_m and alr_s, respectively, which is also less than the 66% and 71% explained by Stephens and Diesing's models. However, given the number of sediment samples used by Stephens and Diesing [10] and the resolution of their study, it appears that some pixels may have contained multiple samples. This might have artificially inflated the reported accuracy.
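One way to guard against several samples falling in the same pixel is to thin the samples to one per grid cell before splitting them into training and test sets. The sketch below is a minimal illustration, assuming samples are `(lon, lat, value)` tuples in decimal degrees and that the grid is aligned to the origin of the coordinate system; the function name is hypothetical, and a real workflow would use the origin and cell size of the actual raster.

```python
import math

CELL = 7.5 / 3600.0  # 7.5 arc seconds expressed in decimal degrees

def thin_to_one_per_cell(samples, cell=CELL):
    """Keep only the first sample that falls in each grid cell.

    `samples` is an iterable of (lon, lat, value) tuples.
    """
    kept = {}
    for lon, lat, value in samples:
        # Integer column/row index of the cell containing the sample.
        key = (math.floor(lon / cell), math.floor(lat / cell))
        kept.setdefault(key, (lon, lat, value))
    return list(kept.values())
```

Keeping the first sample is an arbitrary choice; averaging the samples within a cell would be an equally plausible alternative.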
As seen in Table 4, there was a high level of agreement between the two studies, but this was primarily for the 'sand/muddy sand' class, and caution should be taken when interpreting the extent of the other classes, in particular the 'mixed sediments' class.
This study benefited from recent attempts to compile multiple sources into large datasets (e.g., [36,37]). However, some sources of data had limited or no metadata, so it was impossible to know which methods had been used to measure the quantity of each sediment component. Samples collected using a standardised methodology [38] would have been the most suitable for this process, whereas methods such as laser grain size analysis can produce results that differ from those of more traditional sieving techniques [39]. As a minimum requirement for inclusion, samples needed to have the quantities of mud, sand and gravel recorded and to have been collected post-1990. The data were filtered further by removing samples that contained some commonly occurring rounded fractions (e.g., 25%, 50% and 100%). However, it is likely that some imprecisely measured samples were still retained. Further, other sources of groundtruth error, such as locational error [40], changes to sediment type through time and differences in sampling gear, may also have increased map error, but their effect cannot be assessed without adequate metadata. This paper therefore highlights the value of analysing sediment samples using robust quantitative techniques and the need for adequate metadata to be recorded so that data can be appropriately utilised in future studies. Nevertheless, the mapped outputs are still of value, as they portray sediment composition quantitatively and at increased resolution. Further, as they are supported by spatially explicit maps of error, the level of reliance placed on the maps can be varied according to local accuracy.
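The screening rules described above can be sketched as a simple record filter. The field names (`mud`, `sand`, `gravel`, `year`), the reading of "post-1990" as strictly later than 1990, and the rule that a single rounded fraction is enough to reject a sample are all assumptions made for illustration; only the three example values come from the text.

```python
# Example rounded values from the text; the full list used is not given.
ROUNDED = {25.0, 50.0, 100.0}

def keep_sample(sample, min_year=1990):
    """Return True if a sample record passes the screening rules."""
    fractions = [sample.get(name) for name in ("mud", "sand", "gravel")]
    if any(f is None for f in fractions):
        return False  # all three fractions must be recorded
    year = sample.get("year")
    if year is None or year <= min_year:
        return False  # keep only samples collected after min_year
    # Reject records where any fraction looks like a rounded estimate.
    return not any(f in ROUNDED for f in fractions)
```

A filter of this kind cannot detect imprecise measurements that happen not to fall on a listed value, which is consistent with the caveat that some such samples were probably retained.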
Should a certain level of generalisation be required, we would suggest incorporating object-based image analysis [41] into the workflow. This was demonstrated by Diesing [12], where 'noisy' pixel-based predictions were generalised by segmenting the map into areas of homogeneous attributes and then averaging the predictions of the pixels within each segment. While generalised maps may be simpler for end users to interpret, they may also conceal some of the variability of a prediction. 'Noisy' areas within a map, where neighbouring pixels have a high degree of variation, may be an accurate reflection of the environment (e.g., high heterogeneity) or may be an artefact of the model (e.g., missing predictor variables or an overfitted model). Therefore, model generalisation may not always be desirable. For example, sediment fractions have proven more valuable than classified maps for understanding biologically meaningful species assemblages [8], and individual fractions may be particularly valuable for understanding certain sediment gradients, such as organic carbon stocks [13].
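The averaging step of such a generalisation can be sketched as follows. The segmentation itself is not shown: the segment labels are assumed to come from a separate object-based image analysis step, and the function name and flat-list input layout are hypothetical.

```python
from collections import defaultdict

def average_within_segments(predictions, segment_ids):
    """Replace each pixel's prediction with the mean of its segment.

    `predictions` and `segment_ids` are equal-length flat sequences, one
    entry per pixel.
    """
    if len(predictions) != len(segment_ids):
        raise ValueError("inputs must have the same length")
    sums, counts = defaultdict(float), defaultdict(int)
    for value, seg in zip(predictions, segment_ids):
        sums[seg] += value
        counts[seg] += 1
    # Map every pixel back to the mean of the segment it belongs to.
    return [sums[seg] / counts[seg] for seg in segment_ids]
```

The output has one value per segment, which makes the trade-off explicit: within-segment variability, whether real heterogeneity or model artefact, is removed from the map.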
It has been demonstrated that bedrock outcropping at the seabed can be reliably predicted [42-44]. Incorporating such information into basin-scale substrate maps would be a desirable goal for the future. However, several challenges must be met to make this happen. To our knowledge, maps of predicted bedrock occurrence exist only for the UK continental shelf. Also, the existing maps have a much higher resolution (25 m) than the sediment predictions presented here. Finally, a framework must be developed that allows map accuracy or confidence to be expressed when predictions have been made in different ways.
Increasing amounts and types of measured, modelled and remotely-sensed data have become available from the EMODnet and Copernicus data portals. At the same time, methods for quantitative spatial prediction and spatially explicit error assessment continue to evolve. For example, a generic framework for predictive modelling of spatial and spatio-temporal variables using random forest has recently been presented [45]. We expect that products similar to those showcased here will ultimately become available for other sea areas within Europe. These will likely be of use for research as well as for applications such as habitat suitability modelling, nature conservation and marine planning.