The Use of Unmanned Aerial Systems to Map Intertidal Sediment

This paper describes a new methodology to map intertidal sediment using a commercially available unmanned aerial system (UAS). A fixed-wing UAS was flown with both thermal and multispectral cameras over three study sites comprising sandy and muddy areas. Thermal signatures of sediment type were not observable in the recorded data and therefore only the multispectral results were used in the sediment classification. The multispectral camera consisted of a Red–Green–Blue (RGB) camera and four multispectral sensors covering the green, red, red edge and near-infrared bands. Statistically significant correlations (>99%) were noted between the multispectral reflectance and both moisture content and median grain size. The best correlation against median grain size was found with the near-infrared band. Three classification methodologies were tested to split the intertidal area into sand and mud: k-means clustering, artificial neural networks, and the random forest approach. Classification methodologies were tested with nine input subsets of the available data channels, including transforming the RGB colorspace to the Hue–Saturation–Value (HSV) colorspace. The best performance, based on Youden's j-index, was obtained when an artificial neural network was used with near-infrared reflectance and HSV color as input data. Classification performance ranged from good to excellent, with values of Youden's j-index ranging from 0.6 to 0.97 depending on flight date and site.


Introduction
A key aspect of both coastal research and environmental impact assessment for coastal development is the mapping of intertidal sediment type. This paper describes the use of a commercially available unmanned aerial system (UAS) to conduct such mapping. The focus of this work is on sands and muds. A key motivation for the work was the potential development of a tidal energy lagoon industry where altered intertidal coverage of sand and mud is perceived to be the primary environmental impact by both regulators and developers [1]. Gravels and cobbles are less important in this context: such sediment classes make up a much lower percentage of the intertidal in the areas of interest so are not considered a key receptor compared to sands and muds, which provide important benthic habitats. Additionally, the contrast between cobbles and sand is greater and hence they are considered less difficult to identify in remotely sensed data; the authors have previously demonstrated such distinction using terrestrial laser scanners [2].
Intertidal sediment type is important for two main reasons: benthic habitats and coastal morphodynamics. Grain size dictates the makeup of benthic communities (e.g., [3][4][5][6]); even temporally varying fine-scale sedimentological variations, such as differences between the peaks and troughs of mega ripples, can force differences in community structure [7]. Therefore, anthropogenically forced variations in grain size can affect community structure [8,9]. From a morphodynamic perspective, grain size and cohesive properties dictate thresholds of erosion and deposition [10,11], and beach profile is also related to grain size (e.g., [12,13]). In an estuarine context, mud levels can influence long-term morphological development [14]. Episodic coastal mud events have been shown to have impacts on both hydrodynamics and morphodynamics, as well as having societal implications [15].
Alteration of wave exposure and tidal currents from the pre-existing baseline, whether due to coastal development, climate change or other factors, can lead to changes to baseline sediment coverage [16,17]. For example, an area of increased shelter may lead to increased deposition of muds or, vice versa, increased currents may erode finer sediment. To capture any changes, pre- and post-construction monitoring is desirable, particularly for large-scale developments. Seasonal variation in sediment supply or in wave and tidal forcing means that in some areas seasonal changes in sediment coverage may be observable (e.g., [18]) and hence regular monitoring may be required.
Direct sediment sampling and lab-based analysis is the default methodology for determining grain size and sediment type. For many estuarine areas and areas of high tidal range, the width of the intertidal expanse means that such in-situ monitoring is time consuming. This means mapping is based on a sparse grid of samples that may not accurately represent detailed 2-dimensional spatial patterns of different sediment types. Additionally, some intertidal areas are inaccessible by foot and there are health and safety considerations related to working in unconsolidated sediment and close to the low tide line.
Previously, researchers have demonstrated the use of satellite and aircraft remote sensing to map intertidal sediment with good success. Multispectral and hyperspectral instruments attached to light aircraft have been used to map clay content and intertidal grain size distributions [19]; to map percentages of sand, clay and silt and hence classify sediment type [20] or to classify intertidal areas into different classes [21][22][23]. Similarly, a range of satellites has multispectral or hyperspectral sensors that can be used to map intertidal sediment type [24,25] or estimate grain size [26]. These techniques have all been shown to perform well and facilitate classification of large areas of intertidal sediment in a time effective manner. However, the cost of instrumentation and deployment; the large quantities of data; and the specialist knowledge required for processing and interpretation mean they may not be suitable for routine environmental impact assessment work. In such cases, commercial consultancy companies may be more likely to use off-the-shelf tools to acquire data and UAS remote sensing may be more attractive.
In recent years, the use of UAS has proliferated in both academia and industry to facilitate a range of surveys. Typically, this focuses on the use of Red-Green-Blue (RGB) color images to reconstruct digital surface maps (DSMs) through structure from motion (SfM) techniques. Industrial uptake means a range of off-the-shelf products are available, including aircraft, sensors and pre- and post-processing software. One area that has embraced the use of UAS is precision agriculture, which means that relatively low-cost multispectral sensors are available. These are optimized for plant health monitoring and typically include spectral bands in the visible, red-edge (RE), and near-infrared (NIR) regions of the electromagnetic spectrum. Equally, thermal cameras are readily available for tasks such as inspection of solar panels. Past research has shown that thermal signals can be used to classify sediments: sandy sediments have a much stronger response to heating than muddy sediments [27]. The difference is caused by differences in sediment composition and porosity [28].
Here, both a thermal sensor and an agriculturally focused multispectral sensor were tested at three sites (Figure 1) on a small fixed-wing UAS. Variation of the UAS-measured parameters over the intertidal and their relationships with grain size and moisture content are presented. Next, different sets of the measured data channels were passed to three different classification routines. These routines were used to define different sediment type regions; a binary discrimination between "sand" and "mud" was considered. The three classification routines tested were k-means clustering, artificial neural networks (ANNs) and the random forest (RF) approach. The k-means technique was selected as it provides an example of an unsupervised technique which would allow efficient automation of the process and has previously been applied to sediment classification of intertidal areas [29]; ANNs were utilized because the authors have previously had success with ANNs for sediment discrimination using terrestrial laser scanner data [2]; while the RF technique has been reported as a useful tool for classification of sediment type in remotely sensed data [30]. The objective of the study is to demonstrate the potential of low-cost, off-the-shelf UAS to map intertidal sediment type. The motivation is to facilitate the uptake of this new technology for both commercial and research applications.

Materials and Methods
The methodology is split between the study sites, field methodology and classification methodologies. Within the field methodology section is a description of the flight process, the lab-based sediment analysis and the initial post processing of the flight data. The classification methodology section describes the three techniques used and the metric used to compare them.

Flight Methodology
Flights were conducted with both the senseFly eBee (Neath Estuary flights, see Section 2.1.3) and senseFly eBee Plus drones (additional sites). Flight details are given in Table 1. Both drones are fixed wing, powered by a single electric motor and propeller, and are launched by hand. The eBee is 0.7 kg in weight with a wingspan of 0.96 m whilst the eBee Plus is 1.1 kg and has a wingspan of 1.1 m. In terms of nominal coverage, at an altitude of 120 m the eBee can cover 140 ha and the eBee Plus 220 ha in one flight. Two different sensors were used: a Parrot Sequoia multispectral sensor and a senseFly thermoMAP thermal camera (both purchased in the UK via Korec group). The Sequoia camera includes a rolling shutter RGB camera and four multispectral sensors covering the green (530-570 nm), red (640-680 nm), red edge (730-740 nm) and near-infrared (770-810 nm) bands. An upward facing sunshine sensor with the same four spectral bands allows for self-calibration of the reflectance values. Additional calibration was conducted prior to each flight using a reflective target.
The thermoMAP camera, which was only used for the three flights at the Neath Estuary, measured thermal infrared radiation in the range 7.5-13.5 µm, which corresponds to a temperature range of −40 to 160 °C with a resolution of 0.1 °C.
Flights were planned using the senseFly eMotion 3 software. The surveyed area was kept the same for both sensors. Flight area was defined based on area of interest, consideration of obstructions, maximum flight time and the 500 m permitted working radius as specified by the UK Civil Aviation Authority. The size of the area of interest was reduced for the Neath site after the first flight. The flight software calculated the flight path based on specification of a required ground pixel resolution and expected wind speed. Ground pixel resolution varied between 6 and 9 cm for the multispectral sensors and was set at 10 cm for the thermal camera. Both lateral and longitudinal overlap were set to 75% for all flights. Where both sensors were flown, the multispectral sensor was always flown first; the rationale being to ensure UAS-measured temperature was as temporally close to point temperature measurements as possible. The same software was used for flight operation. Flights were fully automated and conducted on autopilot from take-off to landing. The thermoMAP camera was flown in time lapse mode (continual photo recording) whereas the Sequoia was flown in photo mode. All flights were conducted on the outgoing tide as close to low tide as possible (Table 1).
Prior to flying, ground control points (GCPs) were set out and their position surveyed using RTK-GPS to enable accurate geolocation. GCPs were created from black and white chequered lino that was cut into squares with two white and two black quadrants. Each quadrant had an edge length of 24 cm. To facilitate identification in the thermal imagery, the white sections were covered in aluminum foil, increasing the thermal contrast. The aim was for GCPs to be evenly distributed about the study area (Figure 2); however, accessibility of the intertidal limited this. Note that not all GCPs are shown for the first flight, which covered a wider area than subsequently analyzed. The number of GCPs per 100 photos ranged from 1.8 (Appledore) to 5.9 (Neath flight on 03/05/2018); this means that all flights have a reasonable number of GCPs [32].

Additional Measurements
To support analysis of the drone-measured data, point measurements of sediment temperature and moisture content were taken. These were taken immediately after completion of both flights. The positions of points were measured with a Topcon HiPer HR RTK GPS. A Fisher Scientific Traceable flipstick thermometer with 0.1 °C resolution and 0.3 °C accuracy was used for the surface temperature measurements. A Delta-T ML3 ThetaKit was used to measure surface moisture. This device had an accuracy of ±1% and was chosen for its wide temperature and salinity operating ranges.
To test the classification performance, a classification of the sediment into "sand" or "mud" was made at specific points; this classification was made based on granularity and color of the surface sediment. For a sub-sample of these points, sediment samples were collected and processed to ground-truth the visual classification (see Section 2.1.3). Sediment surface samples were collected in the field using a stainless-steel combination auger to a 5-cm depth. In the laboratory, samples were oven-dried at 40 °C until dry before clays were disaggregated using a ceramic pestle and mortar. Sediments were passed through a 1-mm stainless sieve to remove stones, shells and larger pieces of organic matter, and the sediment mass of the <1-mm grain-size fraction was recorded. For particle-size determination analysis, the dry sediment samples were fully homogenized before sub-samples weighing 6 g were taken. These samples were chemically treated to remove organic matter. Particle size was determined in a Beckman Coulter LS230 instrument that utilizes laser diffraction and measures particle sizes from 0.04 to 2000 µm in a wet module and to ISO 13320:2009 standards. All samples were analyzed using an automatic measurement mode using a Standard Operating Procedure (SOP) created prior to analysis. Samples were calibrated according to Thermo Scientific™ National Institute of Standards and Technology (NIST) traceable size-standards of 15-µm, 50-µm and 200-µm nominal diameters. Values of median grain size were used to classify samples based on the Wentworth-Udden Scale. Results from the sediment analysis are presented in the study site description rather than in the results.
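The final step above, assigning a size class from the median grain size, can be sketched as a simple lookup. This is a minimal illustration and not the authors' procedure: the class boundaries are the standard Udden-Wentworth limits (the sand threshold of 62.5 µm is commonly rounded to 63 µm), and the names `WENTWORTH_UPPER_BOUNDS_UM` and `wentworth_class` are hypothetical.

```python
# Upper bounds (µm) of the Udden-Wentworth size classes, finest to coarsest.
# Assumption: standard class limits; the sand threshold 62.5 µm is often quoted as 63 µm.
WENTWORTH_UPPER_BOUNDS_UM = [
    (3.9, "clay"),
    (7.8, "very fine silt"),
    (15.6, "fine silt"),
    (31.25, "medium silt"),
    (62.5, "coarse silt"),
    (125.0, "very fine sand"),
    (250.0, "fine sand"),
    (500.0, "medium sand"),
    (1000.0, "coarse sand"),
    (2000.0, "very coarse sand"),
]


def wentworth_class(d50_um):
    """Return the size class for a median grain size given in micrometres."""
    if d50_um <= 0:
        raise ValueError("grain size must be positive")
    for upper_bound, name in WENTWORTH_UPPER_BOUNDS_UM:
        if d50_um < upper_bound:
            return name
    return "gravel or coarser"


# d50 values quoted in the study site descriptions.
for d50 in (294, 158, 97, 39):
    print(d50, "µm:", wentworth_class(d50))
```

Applied to the d50 values reported for the study sites, the lookup reproduces the class names used in the site descriptions.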

Study Sites
Three UK study sites were used in this investigation: the Neath Estuary (South Wales) flown three times to develop the optimum classification methodology and then two further sites flown to test the wider applicability of the methodology. These two additional sites were at Appledore (North Devon) and Llansteffan (South Wales). The site locations are shown in Figure 1. Orthomosaics of drone imagery for all three sites are displayed in Figure 2 and panoramic photos in Figure 3.
The Neath Estuary site is located where the Neath River meets the open sea. It comprises a central ridge of sand dunes with salt marsh and muddy creeks to the north and open sand beach to the south. The site is bounded to the east by the estuary. The topography of this site is highly three-dimensional owing to the presence of the Neath River and small creeks draining the muddy area in the lee of the sand dunes. Areas defined as "sand" in the visual classification were medium sand with an average d50 of 294 µm; these areas had minimal mud content, with less than 2% volume below 63 µm (the sand threshold on the Wentworth-Udden scale). Areas defined as "mud" had a d50 varying between 97 µm (very fine sand) and 158 µm (fine sand); however, the percentage of mud increased to between 14% and 38%. During the time period of the test flights there were significant beach sediment recycling activities taking place just to the south of the study area. Visually this seems to increase the sand areas in the section to the north east of the small creek.
The second study location was at Appledore in North Devon at a site locally known as "The Skern". This site was bounded by cobble and sand dunes to the north, west and south and by the estuary on the east. The site showed less three-dimensionality than the Neath Estuary site but there were still some drainage creeks present, particularly in the lower intertidal. The upper intertidal comprised sand and well-established saltmarsh, with muddy areas and pioneer saltmarsh in the middle intertidal. The lower intertidal was more spatially varying due to the presence of drainage creeks and varied between sand, mud, and muddy sand. Visual samples defined as "sand" were fine sand on the Wentworth-Udden classification with d50 values between 185 and 227 µm. Points visually classified as "mud" ranged from medium to coarse silt (31-39 µm).
The final site is at Llansteffan on the River Towy in South Wales. Whereas the other sites were at the seaward edge of estuaries, this site was further upstream and more two-dimensional. It consisted of a sandy upper intertidal and a muddier lower intertidal, with gradual gradation between the two across the intertidal profile. Correspondingly, d50 reduced across the profile from 177 µm (fine sand) to 94 µm (very fine sand). Measured sediment samples that were visually classified as sand had less than 3% volume mud, while the samples visually classified as mud had between 16% and 39% volume of mud-sized particles.

Post-Processing of UAS Data
Prior to detailed analysis, the UAS collected data were post processed in four stages; firstly, flight logs and images were imported and georeferenced using the eMotion 3 software; secondly, the Pix4D software was used to generate color orthomosaics and index maps of multispectral reflectance; thirdly, GeoTIFF images created by Pix4D were resampled onto the same 0.1 m grid; finally, all data was screened to remove non-sediment areas.
The first stage is automated and requires minimal user intervention. The second stage of this process is more complex and includes initial processing; refinement of geo-location through marking of GCPs; followed by densification of the point cloud and generation of orthomosaics and reflectance maps. The SfM approach used in Pix4D is described by [33]: the initial keypoint extraction and matching is based on binary descriptors [34]; a block bundle adjustment based on [35,36] is then conducted to determine internal and external camera parameters. Within the Pix4D software is a range of templates to optimize processing: for this study the ThermoMAP camera template was used for thermal imaging flights and the Ag Multispectral and Ag RGB templates were used for imagery from the Sequoia. At this stage in the process, manual identification of the ground control points in the images was conducted using Pix4D's ray cloud editor. Average root mean square error (RMSE) in GCP location varied between 0.026 m and 0.27 m, which was deemed acceptable for this work. Subsequently the point cloud is densified and then DSMs, orthomosaics, and reflectance maps are created. In the third stage, the GeoTIFF images were interpolated onto one 0.1 m resolution grid covering the area of interest. At this point in the process careful visual inspection was conducted to check that there were no odd features in the generated maps and that the geolocation of all maps matched.
The fourth stage was the screening process, which removed open water, vegetation, and supratidal areas so that the areas left were predominantly bare intertidal sediment. Open water pixels were removed using the normalized difference water index (NDWI) [37]. This index is calculated as:

$$\mathrm{NDWI} = \frac{X_{green} - X_{NIR}}{X_{green} + X_{NIR}} \qquad (1)$$

where $X_{green}$ is the reflectance in the green band and $X_{NIR}$ is the reflectance in the near-infrared band. Positive values of this index are considered open water and thus are removed. Vegetation was removed using the normalized difference vegetation index, NDVI (Equation (2)):

$$\mathrm{NDVI} = \frac{X_{NIR} - X_{red}}{X_{NIR} + X_{red}} \qquad (2)$$

Values over 0.3 were considered to be vegetation and removed following [38]. In Equation (2), $X_{red}$ is the reflectance in the red band. A DSM created in Pix4D was used to remove supratidal areas based on the level of highest astronomical tide at the nearest national tide gauge network station. An example of this screening process is given in Figure 4.
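The screening rules described above can be sketched per pixel as follows. This is a minimal illustration under stated assumptions rather than the processing chain used in the study: it takes the standard definitions NDWI = (green − NIR)/(green + NIR) and NDVI = (NIR − red)/(NIR + red), the function names are hypothetical, and the reflectance, elevation and tide-level values in the example are invented for demonstration.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else None


def is_bare_sediment(green, red, nir, elevation_m, hat_m, ndvi_threshold=0.3):
    """True if a pixel passes the open-water, vegetation and supratidal screens."""
    ndwi = normalized_difference(green, nir)  # water index: green versus NIR
    ndvi = normalized_difference(nir, red)    # vegetation index: NIR versus red
    if ndwi is None or ndvi is None:
        return False  # no usable reflectance
    if ndwi > 0:
        return False  # positive NDWI: open water
    if ndvi > ndvi_threshold:
        return False  # NDVI above 0.3: vegetation
    if elevation_m > hat_m:
        return False  # above highest astronomical tide: supratidal
    return True


# Hypothetical pixels: (green, red, NIR reflectance, DSM elevation in m).
HAT_M = 5.0  # assumed highest astronomical tide level, for illustration only
pixels = {
    "sand": (0.20, 0.25, 0.30, 1.0),
    "water": (0.08, 0.05, 0.02, 0.2),
    "saltmarsh": (0.10, 0.05, 0.50, 3.0),
    "dune": (0.20, 0.25, 0.30, 6.0),
}
for name, (green, red, nir, z) in pixels.items():
    print(name, is_bare_sediment(green, red, nir, z, HAT_M))
```

Only the first of the four hypothetical pixels survives all three screens; the others are rejected as water, vegetation and supratidal respectively.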

Classification Methodology
One unsupervised and two supervised classification routines are tested here; unsupervised routines classify the data a priori based on characteristics of the dataset whereas supervised routines require some form of user input for training, typically definition of pre-classified subsets of the data. The unsupervised method used is k-means++ cluster-based classification [39] and the two supervised routines are based on artificial neural networks (ANNs) and on random forests [40]. All methods make use of built-in MATLAB routines. Only data from the Sequoia sensor was fed to the classification routines (see the reasoning in Section 3.1).
The RGB data was also transformed to the HSV colorspace: past research has suggested HSV may be better than RGB for image classification because there is less covariance in the HSV colorspace [41][42][43][44][45]. This gives 10 available data channels, and the performance of 9 sets of these was considered. The 9 sets tested were: RGB; HSV; the multispectral channels (MS); RGB and MS; HSV and MS; RGB, red edge (RE) and near-infrared (NIR); HSV, RE and NIR; RGB and NIR; HSV and NIR. The required input for each technique was an n by m matrix, where n is the number of samples and m is the number of data channels (between 3 and 7 depending on the set of data). Therefore, gridded data was transformed to vectors to conduct the analysis and then the output was transformed back to gridded classification maps.
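The channel handling described above can be sketched with the Python standard library. This is an illustrative sketch only: the study used MATLAB, the channel names, `CHANNEL_SETS` and both functions are hypothetical, and the example pixel values are invented. The RGB to HSV conversion uses `colorsys.rgb_to_hsv`, which expects inputs scaled to the range 0 to 1.

```python
import colorsys


def rgb_to_hsv_pixel(r, g, b, max_value=255.0):
    """Convert one RGB pixel (0..max_value) to HSV, each component in 0..1."""
    return colorsys.rgb_to_hsv(r / max_value, g / max_value, b / max_value)


# Names for the 10 channels: RGB, HSV and the four multispectral bands.
RGB = ["R", "G", "B"]
HSV = ["H", "S", "V"]
MS = ["green", "red", "RE", "NIR"]

# The nine input sets listed in the text.
CHANNEL_SETS = {
    "RGB": RGB,
    "HSV": HSV,
    "MS": MS,
    "RGB+MS": RGB + MS,
    "HSV+MS": HSV + MS,
    "RGB+RE+NIR": RGB + ["RE", "NIR"],
    "HSV+RE+NIR": HSV + ["RE", "NIR"],
    "RGB+NIR": RGB + ["NIR"],
    "HSV+NIR": HSV + ["NIR"],
}


def build_feature_matrix(pixels, channels):
    """Flatten per-pixel dictionaries into an n-by-m list of rows."""
    return [[pixel[name] for name in channels] for pixel in pixels]


# One hypothetical pixel with RGB counts and multispectral reflectances.
pixel = {"R": 200, "G": 180, "B": 150,
         "green": 0.20, "red": 0.25, "RE": 0.28, "NIR": 0.30}
pixel["H"], pixel["S"], pixel["V"] = rgb_to_hsv_pixel(pixel["R"], pixel["G"], pixel["B"])

features = build_feature_matrix([pixel], CHANNEL_SETS["HSV+NIR"])
print(features)
```

Each set yields between 3 and 7 columns, which is the range of m quoted above.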
The k-means algorithm [46] is a process that seeks to partition a set of n datapoints χ into k subsets, or clusters, based on minimizing the mean distance of points in a cluster to the centroid of that cluster. The k-means++ algorithm is a modification of the original k-means algorithm to improve the initial seeding process [39] by seeking starting centroids at data points that are relatively distant from existing centroids. Ref. [39] show that this approach both speeds up computational time and improves cluster definition. Since there is no supervision, the clusters of data are defined based purely on the data itself. It was found that while setting k to 2 did routinely split between sand and mud, data screening prior to classification was required (Section 2.1.4).
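The seeding and clustering steps described above can be sketched as follows. This is a minimal pure-Python illustration of k-means with k-means++ seeding, not the MATLAB routine used in the study; the function names and the six two-channel data points are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_pp_seeds(points, k, rng):
    """k-means++ seeding: each later centroid is drawn with probability
    proportional to its squared distance from the nearest existing centroid."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:  # all points coincide with existing centroids
            pick = rng.choice(points)
        else:
            pick = rng.choices(points, weights=weights, k=1)[0]
        centroids.append(list(pick))
    return centroids


def kmeans(points, k=2, iterations=50, seed=0):
    """Return (labels, centroids) after Lloyd iterations from k-means++ seeds."""
    rng = random.Random(seed)
    centroids = kmeans_pp_seeds(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-channel data with two well-separated groups.
points = [[0.10, 0.20], [0.12, 0.22], [0.11, 0.19],
          [0.80, 0.90], [0.82, 0.88], [0.79, 0.91]]
labels, centroids = kmeans(points, k=2)
print(labels)
```

Because the method is unsupervised, the cluster numbers carry no meaning by themselves; which cluster corresponds to sand and which to mud has to be decided afterwards.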
For the supervised classifications, subsets of the data where the sediment type was known were used to train the classification technique before the technique was applied to the whole area of interest.
The first supervised classification technique tested was an ANN approach. A two-layer feed-forward ANN was used [47]. The hidden layer has a sigmoid transfer function which enables a probability to be assigned to each classification. Sensitivity testing with different numbers of hidden neurons showed no statistically significant difference in classification performance and therefore the default value of 10 neurons was specified for the hidden layer. The number of neurons in the output layer is equal to the number of classes of sediment being discriminated between. The network was trained for weight and bias of the connections between neurons using the scaled conjugate gradient method [48]. Output is a k by n matrix, where k is the number of sediment classes and n is the number of data points, populated with probabilities that a specific point is a specific sediment class. Each data point was assigned the class with the highest probability.
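The way class probabilities are turned into a label can be sketched with a forward pass through a small network. This is an illustration only: the softmax output layer is an assumption (the text specifies only the sigmoid hidden layer), the weights are random and untrained, so the predicted label is not meaningful, and training by scaled conjugate gradient is not shown. All function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Convert raw scores to probabilities that sum to one."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: sigmoid hidden layer, then one output neuron per class."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    scores = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(scores)


def classify(x, params, class_names=("sand", "mud")):
    """Assign the class with the highest probability."""
    probs = forward(x, *params)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs


def random_params(n_inputs, n_hidden=10, n_classes=2, seed=0):
    """Untrained placeholder weights, for illustration only."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_classes)]
    b_out = [rng.uniform(-1, 1) for _ in range(n_classes)]
    return w_hidden, b_hidden, w_out, b_out


# One hypothetical input row with four channels (H, S, V, NIR).
params = random_params(n_inputs=4)
label, probs = classify([0.10, 0.25, 0.78, 0.30], params)
print(label, [round(p, 3) for p in probs])
```

Applying `classify` to every row of the n by m input matrix gives one column of probabilities per data point, which corresponds to the k by n output described above.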
The second supervised approach used was the random forests (RF) approach [40]. This approach uses a bootstrap aggregated ensemble of decision trees, which reduces the likelihood of overfitting compared with using a single decision tree. Thirty trees were used in the forest: sensitivity testing with between 10 and 320 trees showed no statistical difference in classification performance.
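Bootstrap aggregation with majority voting can be sketched as below. This is a deliberately simplified stand-in and not the MATLAB routine used in the study: each "tree" is a single-split stump rather than a full decision tree, each stump sees a random subset of the channels, and the training rows are invented. All names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_ids):
    """Find the single threshold split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            left_lab = Counter(left).most_common(1)[0][0]
            right_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != left_lab for lab in left)
                      + sum(lab != right_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_lab, right_lab)
    if best is None:  # no valid split: always predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_lab, right_lab = stump
    return left_lab if row[f] <= threshold else right_lab


def fit_forest(rows, labels, n_trees=30, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random channel subset."""
    rng = random.Random(seed)
    n, m = len(rows), len(rows[0])
    n_candidates = max(1, int(m ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(m), n_candidates)  # random subset of channels
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical two-channel training data.
rows = [[0.30, 0.70], [0.32, 0.72], [0.28, 0.68], [0.31, 0.74],
        [0.10, 0.40], [0.12, 0.42], [0.09, 0.38], [0.11, 0.41]]
labels = ["sand"] * 4 + ["mud"] * 4
forest = fit_forest(rows, labels, n_trees=30)
print(predict_forest(forest, [0.33, 0.75]), predict_forest(forest, [0.08, 0.37]))
```

Averaging over many resampled learners is what makes the ensemble less sensitive to any single training sample than one tree fitted to all of the data.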
To compare the different results, comparison was made between the visually classified field samples and points extracted from the classification maps. Confusion matrices were used, which give clear insight into the performance of a classification technique; the structure is given in Table 2. Confusion matrices provide an intuitive description of how well a methodology is performing; however, in order to rank the different methodologies, Youden's index is used [49]. This is calculated as:

$$j = \frac{TP}{TP + FN} + \frac{TN}{TN + FP} - 1 \qquad (3)$$

where j is Youden's index and TP, FN, TN and FP are the numbers of true positives, false negatives, true negatives and false positives in the confusion matrix (Table 2). The j-index has values ranging from −1 to 1, with 1 indicating a perfect classification.
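The ranking metric can be sketched directly from the confusion matrix counts. The sketch uses the standard definition of Youden's index, sensitivity plus specificity minus one; treating "sand" as the positive class is an arbitrary assumption, and the function names and example classifications are hypothetical.

```python
def confusion_counts(truth, predicted, positive="sand"):
    """Count true positives, false negatives, false positives and true negatives."""
    tp = fn = fp = tn = 0
    for t, p in zip(truth, predicted):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, fp, tn


def youden_j(tp, fn, fp, tn):
    """Youden's index: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Hypothetical field classification versus map classification at ten points.
truth = ["sand"] * 5 + ["mud"] * 5
predicted = ["sand"] * 4 + ["mud"] + ["mud"] * 5
tp, fn, fp, tn = confusion_counts(truth, predicted)
print(tp, fn, fp, tn, round(youden_j(tp, fn, fp, tn), 2))
```

Here one of five sand points is mapped as mud and all mud points are correct, so sensitivity is 0.8, specificity is 1 and j is 0.8. A classifier that assigns every point to one class scores 0, which is why j is more informative than raw accuracy when the classes are unbalanced.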

Variation of Measured Parameters over the Intertidal
Temperature and moisture were measured at a range of locations for the three flights at the Neath Estuary site. Figure 5 shows the variation of these parameters. The left-hand panel shows a comparison of UAS-measured temperature against point-measured temperature; it demonstrates that the two are strongly related (r = 0.96; significant at the 99% level) and that UAS-derived temperatures reliably reproduce intertidal temperature in this case. The central panel shows a comparison between point-measured moisture content and point-measured temperature: for the first flight there is a positive correlation (r = 0.59), significant at the 95% level; there is no correlation for the second flight; for the third flight there is a strong negative correlation (r = −0.82), also significant at the 95% level.
This indicates the variability in the influence of moisture content on temperature with other environmental parameters. There is negligible clustering of the different sediment types. The righthand panel shows variation of point-measured moisture content with point-measured elevation: for all flights there is a negative correlation that is statistically significant at the 95% level. This is unsurprising since points lower on the profile will have had less time to dewater as the tide recedes. While all points with lower moisture contents are sand, there are no mud points higher up the intertidal profile and lower on the profile sand and mud have similar moisture contents.  To further explore temperature variation, Figure 6 shows maps of temperature for the three flights. For all panels, color shading indicates UAS-measured temperature and black contours the morphology. The upper panel shows the first flight (30/01/2018), due to low temperatures, there was insufficient thermal contrast to enable orthorectification of the entire area of interest. There is a sharp break in temperature in the sand portion of the study area; further consideration showed this. matched the level of the previous high tide (red line in image). The second and third flights both had sufficient thermal contrast to enable the entire area to be orthorectified. For the second flight (16/02/2018) the temperature variation seemed related to angle of slope of the morphology, with northward facing slopes being cooler than southward facing slopes (angled towards the sun). The third flight (03/05/2018), when temperatures were higher and when the sun was more directly overhead showed more uniform variation. Temperatures were lower at lower elevations. As shown in Figure 5 there is a strong negative correlation with moisture content. No clear links between temperature and sediment type were observed for any of the flights. 
Therefore, results from the thermal camera were not included in the classification analysis and it is believed that thermal cameras are unsuitable for year-round monitoring of sediment type in a Northern European context.
Consideration was given to the comparison between multispectral reflectance and both moisture content (Figure 7) and measured grain size (Figure 8), where the reflectance is presented as a percentage (the reflected intensity divided by the incident intensity, multiplied by 100). Based on Beer's law, under which grain size and moisture content are exponentially related to intensity [50], the natural log of reflectance is plotted against the tested properties. Faulty calibration meant the red edge reflectance was incorrectly scaled for the flight on 30/01. Therefore, it is ignored for this part of the analysis; since the error was a scaling factor but the shape of variation was correct, it was still utilized in the classification testing (see Section 3.2). Surface moisture content was only available from the three Neath flights; data from all flights are combined in Figure 7. There is a negative correlation for all spectral bands; while r² values are low (values marked on figure), p values indicate that the correlation is highly significant (>99% level).
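The log-transformed comparison described above can be illustrated with a short sketch. It assumes an ordinary least-squares fit of ln(reflectance) against the tested property, which may differ in detail from the fitting used for Figures 7 and 8; the function name `fit_log_linear` and the data values are hypothetical and synthetic, not survey measurements.

```python
import math

def fit_log_linear(x, reflectance_pct):
    """Least-squares fit of ln(reflectance) = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation
    between x and ln(reflectance).
    """
    y = [math.log(v) for v in reflectance_pct]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Synthetic illustration only: reflectance (%) decaying exponentially
# with moisture content, as an exponential (Beer's law type) model implies.
moisture = [5.0, 10.0, 15.0, 20.0, 25.0]
reflectance = [30.0 * math.exp(-0.04 * m) for m in moisture]
slope, intercept, r = fit_log_linear(moisture, reflectance)
```

With noise-free synthetic data the fit recovers the decay rate exactly; real measurements would show the scatter reported in the text.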
The comparison between median grain size and the natural log of the spectral reflectance is shown in Figure 8. Data from all flights and sites are combined in this plot. For all multispectral channels, there is a positive trend to the relationship. Correlations are significant at the 99% level for the green, red and NIR channels (values marked on Figure 8), while the red edge correlation is only significant at the 95% level. As with the comparison against surface moisture content, while the correlations are statistically significant, r² values are low, which indicates the range of parameters affecting reflectance. The statistically significant correlation between grain size and NIR reflectance raises the question as to whether UAS-measured reflectance could be used to directly map grain size.
A relationship between median grain size and NIR reflectance was fitted, in which X_NIR(%) is the percentage NIR reflectance. This was used to estimate grain size, and estimated values are plotted against measured grain size in Figure 9. The equation of the line of best fit is displayed on this figure; it approaches a 1:1 fit, suggesting there is potential in the approach. However, there is a large amount of scatter in the data, as indicated by the r² value of 0.38. Therefore, this approach is taken no further in this paper, although some comment is made in the discussion (Section 4).

Neath Estuary Classification
The classification routines and sets of input data were tested and ranked based on Youden's index (Section 2.2). While reference is made to mud, dependent on site, this may refer to muddy sand (see Section 2.1.3). The best two sets of data and their j values for each flight and classification routine are shown in Table 3, and classification maps for the optimum set of data for each routine are shown in Figure 10. In general, classification performs well, with optimum j values between 0.65 and 0.99. Table 3 shows that a range of sets of data channels gives the optimum results. Apart from the random forest classification on 30/01/2018 (which had a joint optimum between HSV alone and HSV + NIR), the optimum classification makes use of some or all of the multispectral sensors.
Statistically, based on standard error [36], there is no difference in performance between the different flights, classification methodologies or rank for the optimum methodologies (Table 4). Considering the average j values for each classification shows that, in general, the best performing is the artificial neural network (average j = 0.88); the average j value for the k-means classification was 0.84 and the average value for the random forest classification was 0.82. Figure 10 presents classification maps for the optimum set of data channels for each methodology and flight. For flight one (30/01), all spatial maps are similar, which is reflected in the very similar j values.
There are obvious errors in both the k-means and RF classifications for the second flight with areas of mud around the creek flanks classified as sand. The ANN approach appears to perform much better for this flight. For the third flight, again all classification techniques give very similar classification maps. This is unsurprising since all optimum sets have the same j value (0.81).
Table 4. j-index and the upper and lower 95% confidence intervals for all flights and classifications. The statistical similarity is shown by values of the j-index falling within the confidence intervals of other classifications.
Given the statistical similarity between the different classification schemes and sets of input data (Table 4), a point-based ranking was used to establish which set of input data was most likely to give the best results. Using the data in Table 3, from all flights and classifications, 2 points were assigned every time a set of input data was the optimum classification and 1 point when the set of input data was the second-best performing classification. These points were then summed for the 9 sets of input data and the results are shown in Table 5. By far the highest-ranking set is the combination of HSV and the NIR channel. This is perhaps unsurprising as the NIR channel had the best correlation with grain size (Figure 8). Based on this information, and the fact that on average ANN classification performed best, the optimum methodology was defined as an ANN classification using HSV and NIR as input data.
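The point-based ranking can be expressed compactly, as in the sketch below. The function name `rank_input_sets` and the example outcomes are hypothetical placeholders rather than the values in Table 3, and joint optima are not handled.

```python
from collections import Counter

def rank_input_sets(results):
    """Tally ranking points for sets of input data.

    results: iterable of (best_set, second_best_set) pairs, one pair per
    combination of flight and classification routine.
    Scoring follows the text: 2 points for the optimum, 1 for second best.
    """
    points = Counter()
    for best, second in results:
        points[best] += 2
        points[second] += 1
    # Highest-scoring set of input data first.
    return points.most_common()

# Hypothetical placeholder outcomes, NOT the values in Table 3.
example = [
    ("HSV+NIR", "HSV"),
    ("HSV+NIR", "RGB+NIR"),
    ("HSV", "HSV+NIR"),
]
ranking = rank_input_sets(example)
```

Summing points across all flights and routines in this way rewards sets of input data that perform consistently, rather than those that are best for a single flight.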

Confusion matrices for this approach for the three Neath flights are shown in Tables 6-8 and classifications are visually represented in Figure 11. For flight one, all the mud points were correctly predicted as mud and only two sand points were falsely classified as mud; this gave j = 0.97, which indicates an excellent classification. There was greater misclassification for the second flight, giving j = 0.6, but the classification can still be considered reasonable. Of the 23 mud points, 8 were misclassified as sand, and one of the 48 sand points was misclassified as mud. The areas of mud classified as sand can be seen in Figure 11e. The third flight once again provided a good classification, with j = 0.81. All mud points were correctly classified and, of the 48 sand points, 39 were correctly classified.
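To illustrate how the j-index follows from a confusion matrix, the sketch below computes Youden's j as sensitivity plus specificity minus one, using the flight-two counts quoted above and treating mud as the positive class. The 95% interval assumes a simple binomial standard error for each rate; this is an assumption for illustration and may differ from the formulation in [36]. The function name `youden_j` is hypothetical.

```python
import math

def youden_j(mud_correct, mud_total, sand_correct, sand_total):
    """Return (j, lower, upper) for a two-class confusion matrix.

    j = sensitivity + specificity - 1, with mud as the positive class.
    The interval is j +/- 1.96 * SE, where SE assumes independent
    binomial errors on sensitivity and specificity (an assumption).
    """
    sensitivity = mud_correct / mud_total
    specificity = sand_correct / sand_total
    j = sensitivity + specificity - 1.0
    se = math.sqrt(
        sensitivity * (1.0 - sensitivity) / mud_total
        + specificity * (1.0 - specificity) / sand_total
    )
    return j, j - 1.96 * se, j + 1.96 * se

# Flight two: 15 of 23 mud points and 47 of 48 sand points correct.
j, lower, upper = youden_j(15, 23, 47, 48)
```

For these counts j evaluates to roughly 0.63, consistent with the value of 0.6 reported for the second flight.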


Tests at Alternative Sites
The methodology provided good results at both additional sites (j = 0.64 at Appledore and j = 0.71 at Llansteffan). Confusion matrices are shown in Tables 9 and 10 and visual representations in Figures 12 and 13. At Appledore there is a similar amount of misclassification for both classes of sediment. Areas where mud has dried out are sometimes classified as sand and some wet sand points are classified as mud. At Llansteffan all areas defined as mud were correctly classified, but some sand areas were classified as mud. Fast-moving variable cloud cover was present at the time of this flight, which is evident in the orthomosaic and may help explain some misclassification.

Discussion
This paper has demonstrated the ability to distinguish between sandy and muddy intertidal sediments using a UAS with RGB and multispectral sensors. In this analysis, the optimal classification made use of HSV color and NIR reflectance data. Other research [33][34][35][36][37] has demonstrated that HSV is better than RGB for image classification and these results corroborate those findings. The benefit of including NIR reflectance is unsurprising since a positive correlation between median grain size and NIR reflectance was found.
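As a sketch of how the HSV and NIR inputs might be assembled for a single pixel, the example below uses the standard-library `colorsys` module. The function name `hsv_nir_features`, the 8-bit RGB input range and the scaling of every feature to 0..1 are assumptions made for illustration; the processing chain used in the study is not reproduced here.

```python
import colorsys

def hsv_nir_features(red, green, blue, nir_pct):
    """Build a [hue, saturation, value, nir] feature vector for one pixel.

    red, green, blue: 8-bit color values (0..255), an assumed input range.
    nir_pct: NIR reflectance as a percentage (0..100).
    All returned features are scaled to the range 0..1.
    """
    h, s, v = colorsys.rgb_to_hsv(red / 255.0, green / 255.0, blue / 255.0)
    return [h, s, v, nir_pct / 100.0]

# Illustrative pixel: pure red with 20% NIR reflectance.
features = hsv_nir_features(255, 0, 0, 20.0)
```

Separating hue from brightness in this way means that changes in illumination mainly affect the value channel, which is one commonly cited reason for preferring HSV to RGB in image classification.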
The positive correlation between grain size and multispectral reflectance has been noted by other researchers using satellite remote sensing [26] and this relationship has been used to map grain size. Future work will conduct more flights and analyze more point sediment samples to widen the grain size parameter space and endeavor to develop similar tools for use with UAS sensors.
Despite the statistically significant positive correlation between multispectral reflectance and values of d50, there was a large amount of scatter in the data. A more statistically significant relationship was noted between multispectral reflectance and surface moisture content. Surface moisture and grain size are related because areas of coarser sediment will typically dewater faster and hence have lower surface moisture values than areas of finer sediment. Over the intertidal area, this is complicated by variation in dewatering affected by elevation and tidal inundation, as well as spatially varying groundwater influences. It is possible that by including elevation maps and the time of flight relative to the previous high tide, some correction could be made for this complicating factor. Future work is planned to address this aspect.
Very similar results were found for all sets of data and classification routines. This is partially due to the data screening prior to classification, where both vegetation and open water were removed. The motivation for applying this screening process was that, without it, the unsupervised k-means routine failed to split between sediment types, instead regularly creating one class that was predominantly vegetation and one class that was predominantly bare sediment. Increasing the number of classes did not improve classifications. Therefore, it was deemed a fairer test if only bare sediment was passed to the classification routines. Constraining the classification problem in this way was fast, automatable and transparent (relying on well-known indices); therefore, it is not seen as a limitation of the study. Should classification of these regions be of interest, a multi-stage classification could be used to classify (rather than remove) regions of vegetation and water using the method described prior to the sediment classification.
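A minimal sketch of this kind of index-based screening is given below. The NDVI is the standard normalized difference of NIR and red reflectance. The use of a green/NIR normalized difference water index for open water, and both threshold values, are assumptions for illustration only and are not the screening parameters used in this study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total

def ndwi(green, nir):
    """Normalized difference water index: (green - NIR) / (green + NIR).

    Use of this index here is an assumption; the study does not name it.
    """
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total

def is_bare_sediment(green, red, nir, ndvi_max=0.3, ndwi_max=0.0):
    """True if a pixel is neither vegetation nor open water.

    ndvi_max and ndwi_max are hypothetical thresholds for illustration.
    """
    return ndvi(nir, red) < ndvi_max and ndwi(green, nir) < ndwi_max

# Illustrative reflectances (fractions), not measured values.
vegetation = is_bare_sediment(green=0.10, red=0.05, nir=0.50)
water = is_bare_sediment(green=0.08, red=0.05, nir=0.02)
sand = is_bare_sediment(green=0.20, red=0.25, nir=0.30)
```

Only pixels passing both tests would be handed to the sediment classifier, which is what constrains the problem to bare sediment.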
The technique could easily be applied to other broad-scale classes such as bedrock and gravel, and vegetated intertidal areas can be identified using the NDVI. Future work will consider splitting into more detailed sediment classes. Results from the Neath Estuary, where the "mud" portion was actually a mud-sand mixture, suggest that there are sufficient differences in UAS-measured parameters that this should be feasible. Initial tests suggested that an unsupervised classification of more than two sediment types did not work well. Therefore, it is likely that the performance of classification into multiple classes will depend on the ease of defining suitable training datasets.
The multispectral and RGB sensor used, the Sequoia, has a global shutter for the four multispectral sensors but a rolling shutter for the RGB sensor. A rolling shutter leads to inaccuracies in the images when used on a moving platform such as a drone because the sensor will move in between lines of pixels being exposed by the shutter. While the processing software, Pix4D, has a correction algorithm for rolling shutter cameras and errors are minimized by use of GCPs, some errors will remain. In this study, careful comparison of the RGB, multispectral and thermal orthomosaics was undertaken to ensure accurate coreferencing away from GCPs and no obvious errors were noted. Therefore, it is believed that the orthomosaics are suitably accurate for the purposes of the study. The DSMs are less likely to be accurate and the product literature cautions against relying on DSMs from the rolling shutter RGB. In this study, elevation is only used to screen the supratidal area and little error in this application was noted. Other similar sensors, such as the MicaSense RedEdge-M, are available with global shutter RGB cameras, which would remove this uncertainty. Alternatively, a subsequent flight with a global shutter RGB camera would provide more accurate color information.
Remote sensing of intertidal sediment has several benefits over ground-based sampling, typically undertaken on foot. Most importantly, from a logistical perspective, is the reduction in health and safety risk associated with working in unconsolidated sediment close to the low tide line; additionally, time in the field can be significantly reduced which may reduce survey cost. Spatial resolution is much higher which avoids errors in classification caused by interpolation between a sparse network of samples.
Deployment of UAS fits well within the context of other remote sensing options such as airborne or satellite remote sensing. The technique is much lower cost than methods using manned aircraft, which facilitates multiple repeat surveys, as required to capture seasonal change or the rapid adjustment to new coastal developments. The higher pixel resolution, compared to both satellite and manned aircraft solutions, means smaller-scale change could be identified. Additionally, the commercial availability of monitoring solutions is attractive and would make uptake by researchers, industry, governmental and nongovernmental agencies easier. However, the set-up of commercially available sensors is relatively inflexible. Past research has shown that reflectance in the short and medium wave infrared bands is more useful for sediment classification, and that hyperspectral sensors can accurately map geological properties of rock, e.g., [51]. Recent studies have mounted hyperspectral cameras to drones for a range of purposes [52,53]. Such approaches would likely provide better results; however, the sensors are expensive [52] and the flight duration of drones suitable to mount the cameras is low [52,53], reducing the extent of coverage.
Previous studies identified thermal infrared cameras as having good potential for identifying regions of sand and mud [28]. Thermal cameras did not perform well in this study, however, which highlights the importance of other environmental factors in such discrimination. Areas where there is greater or more uniform solar heating may well show better success.
All UAS use is constrained by technical and regulatory factors. UAS regulations are defined by national aviation authorities. In the UK, regulations mean UAS must be flown within visual line of sight, which limits the distance of the UAS to 500 m from the operator; this distance can be extended by application for additional permissions, but they are not always granted. Another key regulatory factor is the 150 m separation distance from "congested areas", which include any area regularly used for commercial, industrial or recreational purposes. Technical factors include takeoff and landing requirements and weather limitations. Fixed-wing UAS such as that used in this study can cover larger areas than an octocopter-type UAS, but require suitable landing areas such as flat grass. In this study, sand areas of the beach were typically used; however, this did constrain the choice of suitable test sites. Octocopters can take off and land vertically and hence landing areas are not a restriction on site selection. In addition, weather conditions including wind speed, visibility and heavy rain must be considered. An analysis of 10 years of weather data from a station at Mumbles, close to the Neath Estuary site, showed that conditions were only suitable for flying 40% of the time.

Conclusions
This contribution explores the use of UASs to map intertidal sediment type. Flights were conducted with both a thermal infrared camera and a multispectral camera that measured red, green, red edge and near-infrared reflectance as well as RGB color. Use of the thermal camera for sediment mapping was discounted in a northern European context because a preliminary study showed that other environmental factors, rather than sediment type, controlled the temperature variation over the intertidal. Relationships of both color and multispectral reflectance with surface moisture were identified. Importantly, there was also a positive correlation between median grain size and multispectral reflectance, with the best correlation using the near-infrared reflectance (r = 0.6, 99% significance).
Three classification routines (k-means, artificial neural networks and random forests) were tested with nine sets of UAS-measured data to broadly discriminate between sand and mud in the intertidal. Prior to classification, the orthomosaics were screened for vegetation and open water, which constrained the problem. Inclusion of multispectral channels improved results over classifications just using color data. The optimum combination was an artificial neural network approach using HSV color and NIR reflectance, where over the 5 flights Youden's j-index varied between 0.6 and 0.97 (where j = 1 corresponds to perfect classification). This approach was deemed very successful and it shows that UASs are a suitable tool for remote sensing of intertidal sediment.