Acoustic Seafloor Classification Using the Weyl Transform of Multibeam Echosounder Backscatter Mosaic

The use of multibeam echosounder systems (MBES) for detailed seafloor mapping is increasing at a fast pace. Due to their design, enabling continuous high-density measurements and the coregistration of seafloor’s depth and reflectivity, MBES has become a fundamental instrument in the advancing field of acoustic seafloor classification (ASC). With these data becoming available, recent seafloor mapping research focuses on the interpretation of the hydroacoustic data and automated predictive modeling of seafloor composition. While a methodological consensus on which seafloor sediment classification algorithm and routine does not exist in the scientific community, it is expected that progress will occur through the refinement of each stage of the ASC pipeline: ranging from the data acquisition to the modeling phase. This research focuses on the stage of the feature extraction; the stage wherein the spatial variables used for the classification are, in this case, derived from the MBES backscatter data. This contribution explored the sediment classification potential of a textural feature based on the recently introduced Weyl transform of 300 kHz MBES backscatter imagery acquired over a nearshore study site in Belgian Waters. The goodness of the Weyl transform textural feature for seafloor sediment classification was assessed in terms of cluster separation of Folk’s sedimentological categories (4-class scheme). Class separation potential was quantified at multiple spatial scales by cluster silhouette coefficients. Weyl features derived from MBES backscatter data were found to exhibit superior thematic class separation compared to other well-established textural features, namely: (1) First-order Statistics, (2) Gray Level Co-occurrence Matrices (GLCM), (3) Wavelet Transform and (4) Local Binary Pattern (LBP). Finally, by employing a Random Forest (RF) categorical classifier, the value of the proposed textural feature for seafloor sediment mapping was confirmed in terms of global and by-class classification accuracies, highest for models based on the backscatter Weyl features. Further tests on different backscatter datasets and sediment classification schemes are required to further elucidate the use of the Weyl transform of MBES backscatter imagery in the context of seafloor mapping.


Introduction
Human pressures on the marine environment are rapidly increasing due to a multitude of concurrently developing economic sectors and interests [1]. At a European level, human activities in the marine environment are regulated by maritime spatial plans (MSP) implemented by each member state under the EU Maritime Spatial Planning Directive 2014/89/EU. The goal of the MSP is to accommodate the designation of space suitable for human activities, while also ensuring environmental protection via the exclusion and/or regulation of activities within, for example, Habitat Directive (HD) and/or Marine Protected Areas (MPAs) [2,3]. The detailed understanding of the type and extent of seafloor sediments, recognized as a keystone surrogate for marine life [4,5], underpins the effective implementation of an ecosystem-based management, including setting the baselines for environmental monitoring. The production of detailed seafloor sediment maps (referred to as benthic habitat maps, when including a biological component, and/or substrate maps, when reporting only abiotic information of the sediment), has drastically matured over the past decades, owing to significant technological developments of the remote sensing instrumentation such as multibeam echosounder systems (MBES) [6,7], as well as to drastic improvements in hydroacoustic data processing [8,9]. This has allowed marine scientists to access seafloor imagery approaching the level of detail encountered in the terrestrial remote sensing realm. As a result, this has enabled the development of equivalent "land cover classification" approaches for the underwater environment [3,[10][11][12].
Due to their design, enabling full-coverage and high-density measurements and recording both seafloor bathymetry and backscatter strength, MBES have become a central hydroacoustic remote sensing technology, routinely employed since the late 1980s for standard mapping operations (e.g., hydrographic surveying) [13]. Seafloor bathymetry (depth) and reflectivity (backscatter strength) are the primary data recorded by a MBES. Bathymetry, measured from echoes flight times and angles, is processed to provide continuous topographic information. Backscatter, retained until recently only as a sonar byproduct, is now recognized as a proxy of the seafloor nature (texture and composition), related to five fundamental physical quantities: (a) the acoustic impedance contrast (or hardness), (b) the seafloor roughness, (c) the water-sediment interface volume heterogeneity, (d) the acoustical frequency and (e) the sound pulse angle of incidence onto the seafloor [8]. Due to the scattering properties of different seafloor substrates and features, backscatter can help determine bottom type [14][15][16]. Put simply, using MBES backscatter for sediment characterization can be interpreted as the identification of "the characteristics and spatial organization of seafloor patches and/or signatures with common acoustic responses and the measurable characteristics of this response" [17].
The backscatter strength returning to the sonar receiver is dependent on geological conditions, sonar equipment, environmental factors, etc. Moreover, various third-party processing software always cannot provide enough details of the compensation algorithm applied in their products, which hinders the subsequent data processing work. To exploit backscatter data for image analysis, rigorous processing steps must be taken to radiometrically and geometrically reduce the data and different backscatter datatypes can be obtained [8,9,17]. Both signal and image-based backscatter can be used for seafloor sediment characterization. Signal-based methods exploit the inherent property of the backscatter angular dependence (i.e., the physical variation in backscatter intensity with angle of incidence onto the seafloor) and refer to Angular Range Analysis (ARA), identifying the characteristics of the angular backscatter response curve describing sediment type [18]. Image-based methods rely on the backscatter imagery (backscatter mosaic); removing the angular dependence via statistical compensation and normalized and referenced to a conventional incidence angle (or a limited range of angles), in such a way that the whole seafloor scene seems to be observed from one same incidence angle [8]. The backscatter imagery is gridded as a function of the bathymetric resolution and presented in a georeferenced frame, generally in the form of a grey-scale image displaying the continuum of geoacoustical facies [9]. This offers the potential to derive several textural and statistical features, for example, derived by geospatial neighborhood analysis based on the pixel values.
Indeed, complementing spatial imagery datasets with seafloor samples acquired from precise locations (i.e., ground-truth data), forms the basis of acoustic seafloor classification (ASC): the discipline that seeks to integrate and characterize the hydroacoustic (geophysical data and explanatory variables) and ground-truth data (response variable) in the interest of geology, sedimentology, and biology, ultimately producing maps providing a detailed understanding of the seafloor environment [19,20]. Acoustic seafloor classification has drastically evolved over the past decades, moving from primarily univariate and manual digitisation of acoustic images to multivariate, and automated classification approaches: machine learning supervised predictive modeling algorithms have particularly gained popularity due to their accuracy, repeatability, and ability to handle several features simultaneously [21][22][23][24][25][26][27][28]. As ASC methodological research advances, it is expected that progress will occur through the refinement of each stage of the ASC pipeline: ranging from the data acquisition to the modeling phase.
Amongst the various aspects of the ASC pipeline, the stage of feature extraction covers a particularly important role. Feature extraction based on backscatter imagery is an active field of research, with numerous approaches tested in recent years, ranging from texture analysis based on Gray Level Co-occurrence Matrices (GLCM-based on the original work of [29]), to statistical moments within neighbors [30], and the use of various filters or transformations of the data (e.g., Moran's I, Sobel filter, etc.) [21]. In the context of seafloor mapping, textural feature extraction can be traced back to the pioneering work of Pace et al. (1979) based on backscatter images recorded by Side Scan Sonars (SSS-a previous generation echosounding technology, specifically designed to acquire backscatter imagery [6]). The method proposed a quantification of the seafloor texture based on the original work of Haralick et al. (1973), stacking statistical measures computed from GLCM layers and the variance-to-mean ratio for the frequency of occurrence of backscatter grey levels [31]. This method achieved satisfactory classification results for the interpretation of both grossly flat sedimentary and geomorphologically complex seafloor environments. Entropy and homogeneity indices derived from backscatter GLCM layers have been found particularly useful for ASC as they closely relate to seafloor roughness and complexity and have been used to describe the seafloor texture in an array of environments [11,32]. Other feature extraction approaches for seafloor characterization tested on backscatter images include combinations of first-order statistics [33,34], the calculation of fractal dimensions [35], and power spectra [36]. Reut et al. (1985) put forward a frequency domain analysis method based on the Fourier Transform to retrieve sediment types from the SSS signal [37]. Filter-based feature-extraction methods have equally been proposed, such as the Gabor filter [38]. Denoting the growing interest for textural feature extraction for seafloor sediment characterization, a recent comparative study by Karoui et al. (2009) demonstrated how GLCM features outperformed the Gabor filter in seafloor segmentation of sonar textures [39]. While comparative studies assessing the performance of classification algorithms are advancing, there exists a paucity of studies quantitatively comparing the classification performance (expressed in terms of thematic accuracy and thematic cluster separation) of different features derived from MBES backscatter data [3].
To address the paucity of comparative studies, this research focuses on the aspect of feature extraction; the stage wherein the spatial variables used for the classification (explanatory spatial information) are, in this case, derived from the MBES backscatter imagery. Specifically, this investigation focuses on texture-based feature extraction methods. We introduce a novel image-based feature extraction approach for the classification of the seafloor sediment type based on the Weyl Transform, that was recently proved effective in other domains including medical image analysis [40,41]. The Weyl transform, defined as a mapping between a signal and its autocorrelation coefficients, has the desirable property of being scale and orientation invariant. Consequently, patches sampled from the same texture will share similar Weyl coefficients [40,42]. On this basis, we develop an effective and robust MBES backscatter feature extraction method which, compared to well-established techniques, can significantly enhance thematic cluster separation. We use this approach to characterize the MBES backscatter imagery of a high-frequency survey of a shallow (10 m) coastal area of the Belgian Part of the North Sea (BPNS). We test the performance of the Weyl transform and its category distinguishability potential against four Folk (1954) sedimentological classes. The main contributions of this work are: (1) A novel Weyl transform based approach to MBES backscatter characterization and its comprehensive evaluation; (2) Analyzing the effects of scale on cluster consistency, and the selection of an optimal scale for feature extraction in image-based ASC with different feature extraction methods; (3) Thorough quantitative evaluation of the Weyl features on real MBES data in combination with common classifiers.

Materials and Methods
This section presents the acquisition and processing of the geophysical MBES and ground-truth datasets. The textural descriptor based on the Weyl transform and the statistical predictive modeling approaches are detailed. The study is based on MBES data acquired in November 2017 in the nearshore coastal area of the Belgian part of the North Sea (BPNS), off the Oostende Harbour. The location of the surveying area is shown in Figure 1. The survey covers around 11 km 2 and encompasses a designated site for the dumping of dredged material (i.e., harbour and channel maintenance) [43].

Multibeam Data Acquisition and Processing
The MBES data were acquired using a hull-mounted Kongsberg Maritime EM2040 Dual system installed on RV Simon Stevin (campaign 17-660-https://www.vliz.be/en/ rv-simon-stevin (accessed on 8 April 2020)) and were logged at a nominal frequency of 300 kHz, in normal mode, CW pulse form and 101 µs pulse length. The survey was designed with at least 20% overlaps between adjacent track-lines. To produce a sedimentologically meaningful image, raw MBES data were processed in several stages as described in [44], including noise reduction to reduce random amplitude fluctuations, geometrical artefact corrections, and angular dependence compensation to produce a compensated backscatter image. This type of backscatter product is coded as "A4 B0 C0 D0 E5 F0" according to the nomenclature proposed in [17]. Bathymetry and backscatter data were gridded with a horizontal resolution of 1 m (

Ground-Truth Data Acquisition and Processing
The ground-truth data were collected by means of cylindrical box-core samples. This type of sampling gear conserves the surficial composition and configuration of the seafloor while conserving the subsurface stratigraphy, providing qualitative geotechnical appreciations of the nature of the sediment in the study area. Using the ternary sediment classification of Folk (1954), samples were labeled as Sand (S), Sandy mud (sM), Muddy sand (mS), and Gravelly sand (gS) as shown in Figure 3. Sediment categories were determined by a combination of particle size analysis (by means of a Malvern Mastersizer 3000) and expert visual observation. Overall, the ground-truth dataset consists of 19 samples and their locations are displayed using colored circular markers in Figure 2. Figure 4 shows the probability density distribution for the backscatter and depth variation in the study area and the adopted sample datasets (Figure 4a,c). The probability density value is normalized between 0 and 1. The selected sample datasets adopted in the experiments have a similar density distribution both in backscatter intensity and depth, indicating that those samples represent the whole area well. The boxplot clearly shows that the samples from each type of sediment tend to have a particular backscatter region and depth zone (Figure 4b,d). "Training" and "test" refer to the distributions of the training and validation sample datasets used in the classification.

Seafloor Characterization Based on the Weyl Transform
Within the framework of ASC, MBES data are integrated with ground-truth data. As MBES backscatter data relates to seafloor surficial structure and composition, texture is regarded as one of the most efficacious features in image-based sediment mapping research. When dealing with the texture image classification tasks, we are interested in representations that are invariant under a given group of transformations, such as translation, scaling, and rotation. The Weyl transform has recently shown remarkable results in the context of texture classification with standard texture images [40], outperforming some common textural descriptors including Histograms of Oriented Gradients (HOG) [45] and Local Binary Pattern (LBP) [46]. HOG and LBP are very popular methods for texture description in computer vision, such as human detection and biomedical recognition problems. The Weyl transform is a mapping between a vectorized signal and its autocorrelation coefficients, which can capture multiscale symmetries of the texture. The transform has the desirable property of being invariant to a large class of multiscale signed permutations. In particular, different ways of orienting and translating the same texture will produce the same Weyl descriptor, and patches sampled from the same texture should share similar Weyl transforms.

The Weyl Representation
The Weyl transform distinguishes different textural structures by quantifying multiscale symmetry features [41]. Moreover, invariance to multiscale transformations ensures that image patches with the same textural structures result in similar Weyl representation. The binary Heisenberg-Weyl group HW 2 m is a group of permutation matrices and matrices that resemble permutation matrices with sign changes in some of the rows. Those square matrices of size 2 m exist for each power of 2 and are defined as Kronecker products: where and a = (a 0 , . . . , a m−1 ), b = (b 0 , . . . , b m−1 ) ∈ Z m 2 are two binary m-tuples. Formally, the binary Heisenberg-Weyl group HW 2 m of order 2 2m+2 is defined as HW 2 m = {i λ D(a, b) | λ ∈ {0, 1, 2, 3} and a, b ∈ Z m 2 }. As shown in [40], the signed permutation matrices D(a, b) with a T b = 0 form an orthonormal basis of the vector space of real square symmetric matrices with respect to the inner product given by R, S := tr (R T S). In particular, each real symmetric matrix R can be represented as a linear combination of the basis elements as Given a vectorized signal y ∈ R 2 m , its covariance matrix yy T ∈ R 2 m ×2 m is a real symmetric matrix and as such can be represented as Coefficients ω a,b (y) are the Weyl coefficients of the signal y and the corresponding isometric mapping yy T → ω a,b (y) is the Weyl transform [40]. Each vector y can be indexed by a binary m-tuple v = (v m−1 . . . v 0 ) T . The covariance matrix yy T can be regarded as a combination of multiple autocorrelation bands and an autocorrelation band of y can be represented as Let ω a ∈ R 2 m represent a subset of Weyl coefficients labeled by a m-tuple a, defined by (ω a ) b := ω a,b . Given binary m-tuples v and w, Walsh-Hadamard transform matrix can be defined as This implies that the Weyl transform can be seen as the Walsh-Hadamard transform of binary autocorrelations.
The absolute values of the Weyl transform coefficients are invariant under conjugation by any element of the binary Weyl-Heisenberg group D a,b as well as by left or right multiplication by D a,b . Moreover, the values are invariant under conjugation by a larger group of permutations corresponding to elements of the binary symplectic group Sp(2m, Z).

Texture Descriptor
MBES backscatter data record quantitatively acoustic intensity scattered to the underwater transducer from the seafloor. Through a series of data processing stages, the backscatter strengths are processed into sedimentologically meaningful relative backscatter level, which subsequently serves as the source data to be mapped into image pixels using an appropriate grid resolution and a reasonable resampling strategy [9]. We hypothesize that autocorrelation, and especially the dyadic autocorrelation framework from Section 2.3.1, can provide good discrimination between greyscale patterns corresponding to different sediment types. Hence, we design here a texture descriptor based on the Weyl transform to represent the local pattern of the backscatter mosaic as concomitant characteristics relating to seafloor physical property.
We divide the multibeam backscatter image into relatively small patches using a moving window of size S w × S w (S w = 2 r , r ∈ Z + ) (the effect of the patch size is analyzed in Section 2.5), with a sliding step of S t . Each patch can be vectorized in a raster-scanning fashion which results in a S = 2 2r dimensional vector. Let m = 2r, a = (a m−1 . . . a 0 ) T and b = (b m−1 . . . b 0 ) T . Then, the Weyl coefficients of patch Y are computed by using Equation (3). Figure 5 shows how we obtain the Weyl representation of a selected patch from a multibeam backscatter image. The complete procedure of the Weyl coefficients computation is summarized in Algorithm 1.

Input:
A multibeam backscatter image, sampling size S w , sliding step S t Procedures: 1: Extract an image patch from the whole image using a moving window of size S w × S w .

Cluster Consistency Comparison with Classical Texture Features
In an ideal classification problem scenario, an optimal textural descriptor should maximize the interclass variability while minimizing the intraclass variability (the same principle applies to the goal of the classification itself; the textural feature enhances this possibility). To evaluate this process in a simplified scenario, 19 blocks were identified within the backscatter image of the surveyed area, being centered with box-core samples and including a very limited area around their neighborhoods (with an area of 256 m × 256 m). Figure 6 shows four backscatter blocks centered with different sediment types. Based on prior knowledge and data analysis, the block size is determined by manual interpretation. We assumed that each block displays a representative and consistent backscatter image texture. We made another assumption that patches from a block belonged to the same class of sediment (these two assumptions are referred to as neighborhood consistency). Then, the locations selected were of a known sediment type, from the ground-truth points. It must be clarified that the relatively small number of ground-truth datapoints available in this investigation may pose a risk to the neighborhood consistency assumptions put forward. Therefore, randomness in image block selection is counteracted by repeating experiments, whereas autocorrelation is treated by introducing a spatial blocking strategy in the image sampling process. For each of the four sediment categories, we randomly sampled 500 8 × 8 patches from the backscatter image in the blocks, with a total of 2000 samples. In turn, these samples were used to derive (a) first-order statistics (FOS), (b) GLCM features, (c) wavelet coefficients (WLT), (d) LBP features and (e) Weyl coefficients (WT). First-order statistics contained five features derived from greyscale imagery, including maximum, minimum, mean, variance, and mode. Statistical features derived from GLCM included contrast, correlation, entropy, homogeneity, and angular second moment. Wavelet coefficients were extracted by performing a level one decomposition using Daubechies wavelet (db2), containing mean and standard deviation of one approximation coefficient and three detail coefficients. LBP encoded local texture structures as normalized histograms using 1 norm. The Weyl coefficients were calculated using Algorithm 1.
In order to quantitatively state the cohesion and separation of the sample clusters in each feature space, the silhouette index (SI) was introduced to access the data consistency within clusters [47]. The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette index is calculated using the mean intracluster distance and the mean nearest-cluster distance (Euclidean distance) for each sample. The silhouette validation technique calculates the silhouette index for each sample, average silhouette index for each cluster and overall average silhouette index for a dataset. After the dimension reduction, a series of x, y, z cartesian plots will aid the visualization of the cluster discrimination potential and separation for those feature extraction methods above at multiple scales. SI computation and all feature extraction methods were implemented using built-in functions in MATLAB R2020b (64-bit) on a computer with an Intel Core TM i7-7700HQ CPU.
(a) Gravelly sand (b) Muddy sand (c) Sand (d) Sandy mud Figure 6. Identified backscatter blocks of four sediment types.

Scale Selection
In this research, we evaluated the effect of the scale of the Weyl coefficients on MBES backscatter imagery so that the relative results can aid the selection of an optimal scale factor to carry out and improve the performance of image-based ASC. To appraise the effect of scale on the sediment type discrimination potential, the following patch sizes were selected for neighborhood analysis: 4 × 4, 8 × 8, 16 × 16, 32 × 32 pixels. For each selected scale and for each of the five feature derivation methods, 2000 image patches (for each sediment type, 500 patches sampled in the same way as the cluster consistency test) were obtained and plotted in a reduced dimensional space obtained by means of Principle Component Analysis (PCA).

Statistical Modeling and Protocol of Error Estimation
To validate the performance of the proposed texture descriptor based on the Weyl transform, we employed an Random Forest (RF) as a classifier for the purpose of quantitative comparison [48]. We implemented the routine in Python 3.8 (64-bit) using the module of sklearn.ensemble. In the training process of the model, parameters were set as follows: the number of trees in the forest was 100 (more trees did not significantly improve the model performance in the test), the number of features to consider when looking for the best split was the square root of the number of all features, Gini impurity was used to measure the quality of a split, bootstrap samples were used when building trees, and the other parameters were left as default.
In general, seafloor data are spatially autocorrelated, which means that those observations close to each other are more similar than distant ones. To avoid the problem of spatial autocorrelation in the sampling data, we adopted a spatial blocking strategy for the allocation of samples to training set and test set. In each identified block of the whole backscatter image, we split the backscatter data spatially into several subblocks. The checkerboard pattern is adopted for the allocation of subblocks to two different folds. This strategy can avoid adjacent subblocks in a fold, which means that all samples of the test set are excluded from the training set [49]. For each subblock, we took 8 × 8 patches by overlapping sampling with a sliding step of 4 pixels, with 32,400 samples being available in total both in training set and test set. In the experiments, we randomly took 2000 samples for every class of the sediment from the dataset, including backscatter patches and their corresponding labels. The training set contained 200 samples and the test set contained the other 1800 of the samples. To quantitatively validate the performance of the proposed texture descriptor for acoustic sediment classification, we performed sediment classification on the test set by feeding the feature variables to a trained RF model using the training set. In the results, each test patch was assigned to a sediment category by a majority voting from the various individual trees in the RF model.
The protocol of error estimation was based on derivation of accuracy metrics from the confusion matrix [50]. The confusion matrix cross tabulates predicted and observed instances providing information on the correct allocation of class labels (along the matrix diagonal) and of class labels confusion between categories (over the offdiagonal entries). The derived metrics included: Overall Accuracy (A o ) (defined by the number of overall correctly allocated instances divided from the overall number of validation samples), Cohen's Kappa (K) (also referred to as "chance agreement"), Recall Rate (R) (the true positive rate) and the OOB RF error (ER) (the error rate estimated by about 36.8% of samples withheld from the training dataset). Figure 7 shows the x, y, z scatter plots for the various features to aid the visualization, and each feature space is projected into three dimensions using PCA. Datapoints are colorcoded according to the sediment category: gS (blue), mS (black), sM (green), S (red). Of the traditional features in the same scale of 8 × 8, the "First-order Statistics" method ( Figure 7b) produced the sharpest cluster separation, with two isolated clusters (blue and green) and one overlapping area (between black and red). The approaches "GLCM" (Figure 7f) and "Wavelet Transform" (Figure 7j) produced moderate separations, both with two overlaps in the outer area of the clusters. "LBP" (Figure 7n) is not able to generate separated clusters for these four classes of samples. Figure 7r shows the potential of the Weyl transform compared to the other set of features (Figure 7b,f,j,n), which presents tight aggregation and clear boundary. To enable a quantitative assessment, the silhouette indexes were calculated for the samples in all feature spaces (Table 1): the Weyl coefficients obtained 0.4803, which was the highest in five features, while LBP gave the lowest value 0.0578. The other three methods provided moderate performance: 0.4386, 0.3880, 0.3373 respectively for first-order statistics, GLCM, and Wavelet transform.

Multiple Scale Analysis
The cluster separation potential resulting from the overall five feature derivation approaches is shown in Figure 7. These results clearly show how cluster separation varies as a function of spatial scale. It can be generally stated that increasing the scale increased the cluster separation for all approaches herein tested.
In the traditional methods, with a sampling size of smaller than 8 × 8, samples represented by first-order statistics, GLCM features, and wavelet coefficients can cluster together for each class, but interclass distance was quite near and some of the clusters were partly overlapping in the outer edge. Samples represented using LBP features were mixed for all four classes. Only when the sampling size was larger than 16 × 16, those traditional methods can show satisfied discriminability, with slight or no overlap areas. Regarding the Weyl coefficients, Figure 7q-t), with increasing the sampling scale, the clusters become more compact, and the overlap areas between clusters become smaller. Intermediate values of scale (i.e., 8 × 8) were sufficient to isolate distinct clusters, and all the values of scale larger than 8 × 8 produced sharp cluster separation. Table 1 quantitatively shows the tendency of the sample distributions by different features and scales using silhouette values of the resulting clusters. We can see that silhouette value increases as the scale increasing in each row of the table. In the column of 4 × 4, first-order statistics obtained the highest silhouette index, while the Weyl transform achieved the highest value in the scales of 8 × 8, 16 × 16 and 32 × 32.

Quantitative Comparison of Classification Accuracy
We compared the performance of the five feature derivation approaches and the detailed classification results using different approaches are reported in Tables 2-6. Firstorder statistics, GLCM features, and wavelet coefficients all demonstrated a superior classification accuracy for Muddy sand and Sandy mud as compared with Sand and Gravelly sand. The overall accuracies of these three methods were at the same level, which are respectively 70.90%, 70.43%, and 71.63%. LBP produced the poorest recall rates for all classes and the lowest overall accuracy. The Weyl transform achieved the best overall classification accuracy. To be specific, higher classification accuracies were obtained for Sand, Muddy sand, and Sandy mud as compared with Gravelly sand. It should be noted that recall rates of all classes in the Weyl transform were above 61%, which none of the other approaches achieved. In order to avoid the biased classification results by randomness, we present the averaged performance of each method for recognizing four classes of sediment in Table 7 after running the test 10 times. The recall rates of each class are presented in the first four columns in Table 7, and the mean overall accuracy of the test is presented in the last column. The overall accuracy scores show that the Weyl coefficients outperformed both single feature extraction methods (A o reaching 76.85%). The Weyl transform achieved both a high overall accuracy as well as high by-class accuracy scores for the classes Sand, Muddy sand, and Sandy mud. Especially, the Weyl transform significantly improves the classification accuracy of the class Sand, which otherwise produced the lowest accuracy scores in the other tested feature derivation approaches (Figure 9).

Discussion
In the context of predictive modeling for acoustic seafloor classification, feature derivation enhances the characteristics of the primary data (i.e., its variability), ultimately improving the classification problem. The goal of the derived feature is that of maximising within-class homogeneity and between-class variability. The derivation of secondary features is particularly appealing where the classification task requires the fitting of a prescribed classification scheme over acoustic data with limited sediment discrimination potential, as for the Folk classes herein tested with 300 kHz MBES backscatter data (with respect to the acoustical operating frequency) [51]. In the projected 3-D feature space, compactness and overlap (i.e., consistency) of the clusters (quantified by the associated metrics, i.e., silhouette coefficients) are retained as key indicators for the evaluation of a feature's classification performance. Recognizing the importance of feature extraction in acoustic seafloor classification, this experimental investigation proposed the Weyl transform as a novel texture-based descriptor of MBES compensated backscatter imagery. This feature enhances thematic cluster separation in terms of Folk (1954) sedimentological categories. Experiments were dedicated to both visually and quantitatively comparing and evaluating thematic cluster consistency across five reduced feature derivation approaches (herein referred to as "traditional features" and dimensionally reduced by PCA) and across various scales of derivation (i.e., neighborhood/window sizes). Furthermore, comparative analyses were set up to evaluate the thematic predictive accuracy of random forest models based on the five disparate sets of secondary features. Together, the experimental results converged showing that based on the backscatter dataset herein used, the texture-based Weyl transform outperformed the other secondary features traditionally used in seafloor sediment classification.

Multiscale Cluster Consistency
Reduced dimensional space plots (Figure 7), produced for multiple window scales (i.e., 4 × 4, 8 × 8, 16 × 16 and 32 × 32), allowed observing different behaviors of the tested approaches. Visually, inspecting the cluster compactness in the 3D space revealed that LBP, first-order statistics, GLCM, and wavelet transform approaches all resulted in substantially overlapping clusters, across all scales assessed for, hence leading to misclassification in the subsequent stage and hindering the prediction of the categories in geographical space (see Section 4.2 in the discussion). In contrast, the Weyl transform produced the sharpest clusters, showing clear separation of the four prescribed categories at all scales except for the smallest windows sizes (4 × 4). Noticeably, across all approaches tested, the multiscale cluster consistency analysis (summarized in Table 1 by means of Silhouette coefficients), indicated that cluster separation increased with increasing window size, reaching the highest separation scores for the 32 × 32 window size. While this suggests a potentially well-performing classification using these features at such a scale, it would come at the cost of losing spatial information, contrasting to the objective of high-resolution seafloor sediment mapping. This behavior, and subsequent limitation, is attributed to the window size: until a size of 32 × 32, it is unable to produce a histogram representative of the statistical distribution of the grey levels for that given thematic class. The Weyl transform produced the highest silhouette scores, outperforming all other approaches in terms of compactness and separation for the representation of these sediment categories. The multiscale cluster consistency analysis identified 8 × 8 as the optimal scale for the Weyl transform. This indicates that the Weyl transform requires less information (a smaller sampling size) than the traditional approaches tested. However, it is important to note that this may be a dataset-specific result, requiring further testing on different backscatter and ground-truth datasets. The optimal scale for analysis closely relates to the practical geological conditions and it would constantly vary with the characteristics of the seafloor. The fact that the Weyl transform coefficients achieve high separability performance with a smaller sampling size is indeed a desirable result.

Classification Accuracy
Once the optimal scale and the desirable properties of the Weyl transform were appraised, the investigation looked at the thematic classification accuracy of random forest models built on the different feature derivation approaches. Here, the Weyl transform achieved the best performance in terms of overall classification accuracy. First-order statistics, GLCM features, and wavelet coefficients scored similar overall accuracies, ranking second to the Weyl transform. LBP showed the lowest performance. Classification accuracy was also appreciated based on by-class accuracies, indicating that compared to other models, the Weyl transform improves the prediction of three out of four types of sediment (Sand, Muddy sand, and Sandy mud). This mainly results from the invariance to a large class of multiscale dihedral transformations. A collection of signed permutation matrices provides good capability for describing textures, with the high-order terms giving coarse-scale permutations and the low-order terms giving fine-scale permutations. The Weyl transform can capture multiscale symmetries that are important in identifying textural structures from acoustical images and associate those with sediment categories. This ensures that the Weyl coefficients of the backscatter image patches obtained from the same sediment type exhibit similarity [40]. Only the category Gravelly sand achieved lower accuracies. The classification accuracy of all classes may be improved by enlarging the sampling window size. However, this would inevitably represent a trade-off: decreasing resolution to increase accuracy. Indeed, several methodological comparative studies, including this investigation, suggest that, to date, there exists no single methodological routine that can consistently achieve the best classification performance, and hybrid approaches, such as ensemble and combinatory maps, integrating into one product the best performing parts of multiple models, has been proposed as a possible way forward [3,25,28]. Similarly, with respect to feature extraction, one possible way forward is that of developing deep feature fusion methods: an approach by which various features are recombined to reach their fullest discrimination potential capabilities [52][53][54]. Recent marine habitat mapping studies by  confirm that for a dataset with an overlapping distribution of backscatter intensities between classes, secondary features derived from MBES data can substantially contribute to classification prediction accuracy [55]. Therefore, this suggests that rather than relying on an ensemble map, refinements at the ASC pipeline stage of the feature extraction could largely benefit classification performance.

Conclusions
In this investigation, the Weyl transform was introduced as a novel texture-based descriptor of MBES backscatter imagery, specifically as an aid to its classification into sediment categories that are at the center of attention of European seafloor sediment mapping harmonisation efforts (i.e., https://www.emodnet-geology.eu/map-viewer/?p= seabed_substrate [56] (accessed on 10 December 2020). The texture descriptor based on the Weyl coefficients describes effectively the multiscale correlation features resulting from MBES backscatter images. The importance of multiscale assessments to avoid the arbitrary selection of the scale parameter was confirmed. Overall and by-class classification accuracy metrics demonstrated that the Weyl transform can improve model accuracy based on the tested dataset and under the spatial assumptions of neighborhood consistency put forward in the methodology. Besides the favorable characteristics of the Weyl transform, no single feature could achieve the best predictive classification performance for all the sediment categories. Feature fusion, similar to hybrid mapping methods merging the outputs of disparate classifiers, shall be explored as a further way of improving classification. Further research is needed to elucidate the physical relationship between scale, Weyl transform, and sediment classes, including testing the method on different backscatter and ground-truth datasets and habitat classification schemes.