Spatial Prediction of Coastal Bathymetry Based on Multispectral Satellite Imagery and Multibeam Data

The coastal shallow water zone can be a challenging and costly environment in which to acquire bathymetry and other oceanographic data using traditional survey methods. Much of the coastal shallow water zone worldwide remains unmapped using recent techniques and is, therefore, poorly understood. Optical satellite imagery is proving to be a useful tool in predicting water depth in coastal zones, particularly in conjunction with other standard datasets, though its quality and accuracy remains largely unconstrained. A common challenge in any prediction study is to choose a small but representative group of predictors, one of which can be determined as the best. In this respect, exploratory analyses are used to guide the make-up of this group, where we choose to compare a basic non-spatial model versus four spatial alternatives, each catering for a variety of spatial effects. Using one instance of RapidEye satellite imagery, we show that all four spatial models show better adjustments than the non-spatial model in the water depth predictions, with the best predictor yielding a correlation coefficient of actual versus predicted at 0.985. All five predictors also factor in the influence of bottom type in explaining water depth variation. However, the prediction ranges are too large to be used in high accuracy bathymetry products such as navigation charts; nevertheless, they are considered beneficial in a variety of other OPEN ACCESS Remote Sens. 2015, 7 13783 applications in sensitive disciplines such as environmental monitoring, seabed mapping, or coastal zone management.


Introduction
The term, "remote sensing" encompasses a series of well established procedures for bathymetric surveys of the seabed [1,2], but these procedures are not without limitations in terms of both the sensors and those imposed by the environment [3].Remote sensing for bathymetry can be subdivided into active and passive techniques.Active remote sensing for bathymetry is generally represented by ship borne multi-beam echo sounding (MBES) sensor arrays [4].MBES surveys produce accurate depth measurements along transects on the seabed but this method is constrained by high operating costs and an inability to survey in very shallow waters, marine protected areas or endangered habitats such as coral reefs [5].Additionally, ship-borne surveys are time consuming and swath widths can be very narrow in shallow waters.Although single beam surveys are less costly than their multi-beam alternative, their low sampling density requires advanced interpolation (or prediction) techniques [6] to estimate seabed topography, and, therefore, are generally not used for high accuracy surveys but rather commissioned by local interest groups as an interim measure.A second non-imaging method known as satellite altimetry can be used to measure the geoidal height and marine gravity field which, in turn, can be used to determine the water depth from the linear relationship between the gravity anomaly and square of the depth [7].This method is only suitable for deep sea, for example, surveying large sea mounts [8].Other active methods such as airborne bathymetric LiDAR (Light Detection and Ranging) have achieved a spatial resolution of 4 m (horizontal) and 20 cm in vertical accuracy at depths of over 30 m.However, LiDAR is not ideal for all water types, as experience gained in Irish waters from the INFOMAR [9] research program demonstrates.These tests resulted in very poor seabed detection along the east coast of Ireland and limited penetration in the west coast up to 15 m.Satellite LiDAR (e.g., ICESat) has been employed by Arsen [10] in conjunction with imagery from Landsat 5 and 7 to ascertain the water levels of inland lakes but this type of LiDAR platform is not designed for high-resolution, high-accuracy bathymetric surveys.
Although [11] has employed spatial video to monitor wave patterns to calculate near shore bathymetry, passive remote sensing for bathymetry is predominantly performed using multispectral satellite imagery.Multispectral imagery is preferable because electromagnetic radiation at different wavelengths can penetrate the water column to different depths and the land-water interface can be clearly defined.Bathymetry is calculated using analytical [12,13] or empirical methods [14,15], based on the statistical relationships between image pixel values and ground truth depth measurements and is applicable in shallow coastal waters.Accuracy is high using these methods for depths of water penetration not exceeding 20 m but becomes less accurate in deeper waters [16].Turbidity is an issue for imagery derived bathymetry in all water types, as [17] have demonstrated in their study of high sediment rivers in tidal inlets [18], and summarized by Gao [16].Hyperspectral satellite data can provide an important insight into the water column properties when deriving satellite bathymetry (e.g., turbidity, oceanographic) [19] due to the greater number and narrower bands, however, the spatial resolution is generally much larger than with multispectral (1 or 2 orders of magnitude).
Optical multi-spectral satellite-derived bathymetry (SDB) that implements analytical or empirical methods, based on the statistical relationships between image pixel values and field measured water depth measurements, apply the general physical principle that sea water transmittances at near-visible wavelengths are functions of a general optical equation depending on the intrinsic optical properties of sea water.A variety of empirical models have been proposed, from linear functions [20], band ratios to log transformed regression models [21], and non-linear inverse models [22] with varying degrees of corrections applied (atmospheric, sun glint, seafloor).The evolution of empirical models has been largely linked to the chronology of satellite platforms: coarse spatial resolution using Landsat TM [13,23,24]; medium spatial resolution SPOT images in shallow waters [15,25] or RapidEye in lakes [26]; and high spatial resolution with the use of commercial satellites such as WorldView [27,28], QuickBird [29,30] and IKONOS [31].
The first and most popular approach for deriving bathymetric estimates from remote sensing imagery was proposed by Lyzenga [20] in clear shallow waters using aerial photographs, identifying that the reflectance from the seabed is dependent on the bottom type and the water depth.Since then, a wide variety of empirical models have been proposed, although depth penetration is limited, water turbidity is an issue and the models require calibration, particularly in areas of variable bottom type.Although [32] have demonstrated an artificial neural network that is not influenced by bottom type or vegetation on the seabed.The Lyzenga method [20] is not restricted to imagery from a single satellite-imagery from Landsat TM [13,21,23,24], QuickBird [29,30], SPOT [15,25] and WorldView-2 [27,28] have all been employed.Nor is the source of the imagery limited to satellites-Flener [33] have performed a comparison between mobile laser scanners and UAV imagery for bathymetric surveys of rivers.Due to the reduced viewing distance achievable with a UAV, this high resolution imagery can result in a ground sampling distance as low as 5 cm.Work presented by Lyons [30] has demonstrated that not only is the Lyzenga method [20] suitable for determining bathymetry but it is also suitable for classifying features on the sea bed such as sea grass.
The methodology used in this research with respect to the remotely sensed imagery has its roots in that applied by Stumpf and Holderied [21] which introduced an algorithm available in ENVI ™ that uses a ratio of reflectance to retrieve depths from IKONOS imagery even in deep water (> 25 m) contrary to a standard linear transform algorithm.Su et al., [22] successfully automated the methodology of Stumpf and Holderied [21], tested with IKONOS images, and implemented it as a module tool within the ArcGIS TM environment.For prediction model choice, we significantly extend that conducted in the related study of Su et al., [34], where the (spatial) geographically weighted regression (GWR) model [35] was shown to provide superior accuracy over a (non-spatial) multiple linear regression (MLR).
For many spatial prediction studies, there exists a wide range of models to choose from (e.g., [36]), and as such, it is rarely sufficient to only compare two models.The skill is to choose a small but representative group of predictors, one of which can be taken as the best.In this respect, exploratory analyses with our study data indicated that the key relationship between the multibeam bathymetry and the reflectance data is not constant across the study area (i.e., the relationship is non-stationary), but also that significant spatial autocorrelation effects are present.Thus, GWR is still considered a good predictor choice, as it accounts for such spatially-varying relationships.However, as autocorrelation effects are also present, we aim to demonstrate a possible improvement to the study by Su et al., (2014) [34] above, by expanding the model comparison from two to five models.One predictor is a hybrid between GWR and kriging (GWRK) [37,38] that captures both spatial effects.The other two predictors stem from the kriging with an external drift (KED) model (e.g., [39]), which, provided it is calibrated within a local neighbourhood (i.e.KED-LN), similarly captures both spatial effects; whilst KED within a global neighbourhood (i.e., KED-GN) only caters for stationary relationships, as with MLR [37,40].Although this particular model comparison is not novel in the wider literature (e.g., see [37,38]), its use in this study's context is.
Thus, a key study aim is to determine to what degree we need to account for our observed exploratory findings in order to provide the most accurate water depth predictions.Are the spatial non-stationarity and spatial autocorrelation effects, equally important (i.e., prefer a KED-LN or GWRK fit)?Alternatively, is one effect significantly more important than the other (i.e., choose GWR in preference to KED-GN or visa-versa)?The MLR fit is not expected to perform well, and simply acts as a benchmark predictor, which the other four models would default to, if spatial non-stationarity and spatial autocorrelation effects were entirely absent in the data.Thus, our study objectives can be summarized as follows:  Evaluate the overall suitability of RapidEye satellite data for deriving coastal bathymetry in a representative C-shaped bay environment, through application of a blue/green band ratio and statistical models using ground calibration points. Determine the most suitable prediction model for the data analysed using a non-spatial, multivariate model versus four spatial alternatives, each catering for a variety of spatial effects. Demonstrate the value in comparing a range of predictors, carefully chosen via a suitable exploratory analysis. Discuss the spatial patterns of the best model's predictions in relation to the bay's physical characteristics.

Study Area and Datasets
The following sections introduce the study area, the satellite imagery employed, the multibeam depth data that was used to calibrate the model and validate it, and the seabed classification maps.

Study Area
The study area (Figure 1) is located in the inner part of Dublin Bay on the east coast of Ireland.The region investigated is a C-shaped inlet, a designated UNESCO Biosphere and covers an area of approximately 50 km² (10 km long and 5 km wide) with water depths ranging from 0-15 m.The bay is joined and bisected by the river Liffey in the south and north.The intertidal region extends in a broad arc around most of the bay, giving a total intertidal area of some 20 km² , in a relatively flat topography interrupted by tidally controlled related features, such as drainage channels and inlets.Beyond the intertidal region, the topography of the bay is characterized by a gentle E-W regional slope not exceeding 0.5°.Near the northern shoreline, off the Howth peninsula, slopes can be, locally, steep up to 5°.The Liffey channel extends ca.6 km E-W and has a U-shape section, with a width between 300 and 1200 m, heights ranging from 6-8 m and wall slopes averaging 1-3°.A number of man-made structures such as pipelines, buoys and anchorage sites are present in the study area described in the nautical charts.Descriptions of Dublin Bay and its intertidal environment are given in earlier papers by several authors [41,42].
The study area has been chosen as a representative test with similar characteristics in terms of geomorphology (e.g., C-shaped bay, estuaries) and socio-environmental factors (e.g., dense city, sea level rise) to other embayment areas worldwide.Data availability issues with recent hydrographic bathy2metry surveys together with suitable satellite imagery have also been a key factor.

Multibeam Bathymetry Data
The groundtruthing water depth (WD) data used in this study was acquired during June-September in 2009 as part of INFOMAR, the Irish national seabed mapping program (www.infomar.ie).The system employed onboard the R.V. Keary was the multibeam Kongsberg-Simrad MBES (EM3002) operating at circa 300 kHz [43].Data was collected using differential GPS and tidal corrections were applied in the post-processing.Depths were reduced to LAT (Lowest Astronomical Tide) and deemed suitable for hydrographic charts.The complete system was capable of measuring water depths between 0 and 15 m with a vertical accuracy of less than 10 cm RMS and a horizontal positioning accuracy of less than 1 m.These data are used as the response variable in our study models.

Satellite Imagery
The (model predictor) image used for this study was acquired by the RapidEye satellite constellation.RapidEye imagery was selected because it was the most suitable in terms of spatial and spectral resolution and also time of image capture.Figure 2 displays the RGB RapidEye image of the study area, characterized by calm sea conditions and a cloud free atmosphere (2% cloud-cover).

Spatial Resolution
There are five pushbroom satellites in the commercial RapidEye constellation, providing a daily off-nadir repeat cycle and a 5.5 day repeat cycle at nadir.The ground sampling distance of the multispectral sensors on board these satellites is 6.5 m but this is re-sampled to provide users with a 5 m spatial resolution for Level 3a orthorectified imagery.A 5 m spatial resolution will enable subtle changes in bathymetry to be identified and quantified (such as bedforms and channels), whereas larger pixel sizes such as a 30 m Landsat 8 pixel will cause a greater degree of generalisation/smoothing to be introduced and steep submerged gradients may not be resolved, leading to errors in bathymetric charts.The 5 m spatial resolution also corresponded with the 5 m MBES grid spacing and each pixel could then be matched with a corresponding MBES point.

Spectral Resolution
The spectral range of the RapidEye satellite begins at 440 nm, a comparable wavelength with the coastal/aerosol band of Landsat8 (433 nm), maximising radiance measurements from the seabed.Although RapidEye has a dedicated "Red-Edge" band at 690 nm-730 nm for vegetation studies that is sensitive to changes in chlorophyll content, we used the NIR band (760 nm-850 nm) in this study, as it outperformed the "Red Edge" band in delineating the land-sea interface.

Time of Imagery
The image employed in these tests was captured on 1 June 2009.RapidEye's sun-synchronous orbit and a 12:13 UTC image capture would correspond with low tide in Dublin Bay on that day (12:05 UTC).There were three important considerations when choosing the date of the imagery: ensuring the image corresponded as closely as possible with the date of the INFOMAR MBES survey, coinciding with low tide to minimize the distance light would have to penetrate through the water column and a high illumination angle which would have the same effect.

Seabed Classification Data
Seabed type information was derived from the seabed geological maps and databases published by the Geological Survey of Ireland (www.gsi.ie).These maps, produced by interpreting multibeam bathymetry and backscatter data, inform about seabed type and geomorphological factors.Data points showing similar characteristics are grouped into discrete classes (see Table 1).Sediment samples are used to label these classes with geological descriptors.This dataset provides a further predictor variable in our study models.Fine-grained sediments (classes 3, 4 and 5) account for over 92% of the Bay's seafloor.Hardgrounds (class 1) and Channel class (class 2) account for 4% each.

Radiometric and Geometric Corrections
Satellite imagery such as Landsat or RapidEye are generally stored in a database and classified by the degree of processing.This ranges from Level 0, a raw, uncorrected image up to Level 4, where all corrections have been applied and a series of analyses (such as vegetation indices) have been applied to the imagery.The RapidEye imagery used in this study was provided at Level 3a, meaning that all radiometric and sensor corrections had been performed.Level 3a imagery has also been aligned to a cartographic map projection and geometric corrections applied using fine DEMs with a post spacing of between 30 and 90 m and Ground Control Points (GCPs).RapidEye satellite images are collected with a bit depth of up to 12 bits and stored on-board the satellites.They are then scaled to a 16 bit dynamic range.Scaling converts relative pixel DNs into values directly related to absolute at-sensor radiances.A scaling factor is applied so that these post-processed single DN values correspond to 1/100th of a W/m 2 sr µm.The digital numbers of the RapidEye image pixels represent the absolute calibrated radiance values for the image.

Atmospheric Corrections
The main focus of this research was to assess the suitability of a number of spatial models to predict relative water depth (WD) using an already tested band ratio algorithm and calibration data.Atmospheric corrections may pose unnecessary bias in accurately assessing the spatial models due to cumulative increase in error uncertainty.Several studies focusing on measuring water quality using Landsat sensors have questioned the advantage of using atmospherically corrected data when attempting to describe the variability of optical water properties [44,45].Additionally, by using a single satellite image, and with the atmospheric conditions present, it is a reasonable assumption that they will not be noticeably different across the test area.

Contribution of Sun-Glint as an Error Source
Sun-glint can appear at the slope of waves and this potentially influences the returned signal.Sun-glint is a problematic issue in bathymetric surveys with satellite imagery.For example, Goodman (2008) [46] demonstrated that sun-glint in high resolution imagery can result in errors of up to 30% of the water depth.It is most common where the sensor is facing towards the sun [47], however, with a relative azimuth of 51° (solar illumination elevation angle minus the satellite view angle) significantly less than 90°, it can be anticipated that sun-glint was not a significant issue for this RapidEye image [48].To further confirm this conclusion, three additional tests were carried out: (a) A visual inspection of the image-the RGB image of the test site was inspected to identify the presence of sun-glint.None were apparent.However the spatial resolution could potentially have masked the effect.(b) In theory, energy in the NIR portion of the spectrum should be absorbed in water, and therefore any NIR that is recorded over water is due to sun-glint.In reality, there is always a small portion of NIR recorded by the sensor and it rarely equates to a DN of 0 [49].The NIR values for the test site were inspected looking for significant variation which would imply areas of sun-glint.None were identified.(c) A final dual-wavelength test was designed to ensure no sun-glint was present in the area.A process developed by Hedley et al., [50] demonstrates a method for removing sun-glint and this was adapted and used to assess whether sun-glint was a significant contributory factor.An area of deep water, where the spectral brightness could be considered as homogenous, was located.The values for the Blue band in the visible portion of the spectrum and the NIR were then plotted against each other and a linear regression line was applied to the data.Here a very low correlation of 0.08 between NIR and Blue band values implied that minimal sun-glint is present, as the NIR values do not increases as the Blue band values increase.The slope of the linear regression line is used by Hedley et al., [50] to apply a de-glinting correction to the pixels.However, our results and the clear shallow slope of the linear regression line from the test area demonstrate that this is not necessary.

Log Ratio Algorithm for Satellite Derived Relative Depth
The method to derive the relative radiance for the satellite data was based on the methodology adopted by Stumpf and Holderied [21].The Relative Water Depth tool enabled in ENVI™ 5 suite was used to extract the log ratio between the green band (520-590 nm) and the blue band (440-510 nm).The algorithm uses a ratio of observed reflectance and two constants to calculate satellite derived relative depth (SDRD), as follows: Rw is the observed reflectance of the wavelength (ë ) for bands i and j, the ratio of which forms the basis for depth extraction.The blue (λi) and green (λj) bands were used for the ratio input as they have the greatest penetration though the water column.The constant, n, is a fixed value, chosen to keep the ratio positive given any reflectance value input.The output must still be calibrated to field data to estimate actual depth.m1 is a tunable constant to scale the ratio to depth, while m0 is the offset correction when the output should be zero (i.e., SDRD = 0), used most recently by Pattaniak et al., [51] and Jawak and Luis [52] for Indian Ocean and Antarctic Ocean coastal surveys.The application of this band ratio algorithm has led to improvements in the water depth empirical models, particularly dealing with seafloor reflectance issues or water column issues [53].However, limitations in the implementation are still hampered by its complexity, particularly on the physics of light transmission within turbid waters [17,54] and in biota covered bottoms as details in Lyons et al., [30].The output SDRD data was exported to vector point data using the pixel centroid as the geo-reference locator.

Data Integration
The geo-referenced SDRD values (our first predictor variable) were combined with the multibeam WD (response) data, together with the categorical seabed type (our second predictor variable), using the nearest neighbor algorithm in ArcGIS™ 10.This resulted in a full data set of 12,665 observations.In order to efficiently compare the accuracy of our prediction models with respect to reduced computational overheads, we then decimated this data by randomly removing 80% of the observations.For objective model comparison, this decimated data set was then randomly split into a model calibration data set of 632 observations and a set-aside model validation data set of size 1898 observations.Here, we choose a 25:75% split as some aspects of model calibration are still computationally expensive (but not prohibitively so).Data processing operations were conducted within the R statistical computing environment (http://www.r-project.org), as were the implementation of the prediction models in the following sections.

Prediction Models
We calibrate and assess five multivariate prediction models, each of which try to accurately predict multibeam WD using SDRD data and seabed-type, as two predictor variables.Preliminary analyses indicated that spatial autocorrelation effects are present and that the relationship between WD and SDRD changes across the study waters.With these initial findings in mind, we choose the following predictors for evaluation: (1) MLR; (2) GWR; (3) KED-GN; (4) KED-LN; and (5) GWRK (i.e., kriging with GWR as its trend component).Essentially, MLR assumes stationary (constant) relationships between the response and predictor data, as does KED-GN.On the other hand, GWR, KED-LN and GWRK each model the same relationships, as non-stationary.In addition, KED-GN, KED-LN and GWRK each account for spatial autocorrelation effects, whilst MLR and GWR do not.Thus, MLR is the only non-spatial model of the five; and only KED-LN and GWRK model both non-stationary relationships and autocorrelation effects.A summary of these properties is given in Table 2.The five study predictors are now formally described.  +   with sample data denoted by α = 1, ⋯ , ; and where   are the residual data.
Using ordinary least squares to find the estimator , a prediction from MLR at a target location  is found from: The GWR model can be defined as   =  0 (  ) + ∑    =1 (  )  +   , where   (  ) is a realisation of the continuous function   () at sample location α.In particular, a localised MLR is calibrated at any location , where observations close to  are geographically weighted (GW) according to some kernel function.Thus, GWR parameters are estimated using a weighted least squares (WLS) approach with weights changing according to location.A prediction from GWR at a target location  is found from: For this study, the weighting matrix in GWR is specified using a bi-square kernel whose optimal bandwidth is found in an adaptive form using leave-one-out cross-validation.Here, the squared prediction error (or cross-validation (CV) score) is calculated for a range of bandwidths and the bandwidth that gives the minimum CV score is considered optimal.
For KED, () is modelled as an unknown using the linear function () = ∑    =0 () (i.e., a MLR fit with  0 () = 1), where it is filtered from the linear predictor by using constraints to give: as a KED prediction at .If () is modelled as an unknown constant, then an ordinary kriging (OK) prediction at  results: ̂  () = ∑     =1 ()(  ).The GWRK predictor at a target location  is the sum of the GWR prediction from Equation (3) and the OK prediction of the residual: GWRK is an explicit model where the mean and residual processes are dealt with separately in a two-stage procedure.KED deals with the mean and residual processes in an implicit fashion where all model equations are solved at once.
Kriging is always defined with the residual covariogram   (), but when the mean is taken as some constant with OK, only the raw data covariogram () is needed.Furthermore, and as is common practice, we find variograms () instead of covariograms (where () = (0) − () is used to relate the two).Restricted maximum likelihood (REML) is used to find relatively unbiased variogram parameters for KED.Parameterization via REML is not suited to GWRK and as such, the residual variogram   () from the GWR fit is first estimated with the usual classical estimator (e.g., [56], pp.153-154) and then modelled using a WLS fitting approach.After some initial experimentation, only exponential variogram model-types are considered, i.e., (ℎ) =  0 +  1 (1 − exp (−ℎ/)); which caters for a (small-scale) nugget variance  0 ; a (large-scale) structural variance  1 (where  0 +  1 =  2 ); and a correlation range α.
All forms of kriging can be applied using local prediction neighbourhoods as opposed to a unique, global prediction neighbourhood; an approximation that can account for local mean fluctuations and ease computational burden.In this study, KED is not only applied in its correct and unbiased form using a global neighbourhood (KED-GN), but also applied using local neighbourhoods (KED-LN), as this usefully caters for relationships that may vary across space (as now the MLR component fit is also local).An optimal neighbourhood for KED-LN is found using a cross-validation procedure that is analogous to that used in GWR for finding its bandwidth (but where the root mean squared error, RMSE, is reported instead).Further details on the calibration of all five study models can be found in Harris et al. [37,57]; Harris and Juggins [38].
Observe that we model spatial non-stationarity as a first-order mean-response effect (via GWR or the trend components of GWRK and KED-LN), whilst spatial autocorrelation is a second-order variance-response effect (modelled via the residuals in GWRK and KED); but as in any single realization process, their exists an analytical impasse in identifying one spatial effect from the other, that can never be satisfactorily resolved (e.g., [55,56]).One consequence of which is that a localised predictor such as GWR will often similarly account for spatial autocorrelation effects even though it is not specifically designed to do so.This is analogous to how a basic predictor such as inverse distance weighting will often provide comparable results to a more sophisticated predictor such as simple or ordinary kriging (e.g., see Cressie [58]).Finally, we do not attempt to model anisotropic effects, either with respect to the shape of the kernel in GWR or the variogram in kriging.

Model Validation
The prediction accuracy of each study model is measured by: for the mean prediction error (MPE), the root mean squared prediction error (RMSPE), and the mean absolute prediction error (MAPE), respectively.The correlation coefficient (Cor-Coef) between (  ) and (  ) is also found.Here, (  ) refers to the actual WD data at the model validation sites, whilst (  ) refers to the predicted WD data at those sites, and  = 1898 is the size of the study validation data set.These model validation statistics are supplemented with various plots and maps to provide a rich picture of a given model's prediction accuracy.

Exploratory Analyses with the Calibration Data
In the first instance, a box-cox transformation of the SDRD data (renamed to SDRD.T) is applied, as it very slightly improves the linearity between it and WD (with a small correlation improvement of −0.87 to −0.88).The transform results in an estimated exponent of lambda = 2.5 (i.e., this transform is applied, SDRD 2.5 ).Regardless of whether a transform is applied or not, it is clear that the SDRD data will be a strong predictor of WD.Histograms and scatterplots of this analysis are given in Figure 3a-e.Note also, that the correlations between WD and the coordinate data are −0.69 and 0.23 for Eastings and Northings, respectively.This confirms the expected East-West trend in WD, as WD in the Bay generally increases towards the East, as we move further from the shore.
Conditional boxplots for WD and SEABED display a worthy WD discriminating value for this predictor variable, especially for seabed class 2 (Figure 3f).The use of this categorical predictor variable in addition to SDRD.T also provides a stronger exploratory MLR fit; The Akaike Information Criterion (AIC) without this variable is 395.3, whilst AIC with this variable is reduced to 375.5.Thus, SEABED is retained as a predictor variable in addition to SDRD.T, and all five study prediction models will be calibrated as such.The spatial distributions of WD, SDRD.T and SEABED are given in Figure 4. Visually, both predictor variables appear suited to help explain the variation in WD, in some way.
In order to investigate relationship non-stationarity, GW correlations are found for WD versus SDRD.T, firstly using (a narrow) 10%, and secondly using (a wide) 50% bandwidth (each with bi-square kernels).Note that the bandwidths for the GW correlations are user-specified and are not optimally found.Details on how to conduct a GW correlation analysis can be found in Harris et al., [59], and follow a similar procedure to a GWR analysis.From Figure 5, it can be seen that this relationship varies across space, where the relationship tends to get weaker as water depth increases (see Figure 4a).This local analysis confirms the value in a calibrating a non-stationary relationship predictor (i.e., GWR, GWRK and KED-LN).Observe that we have not explored the non-stationary relationship between WD and SEABED.

Model Calibration
To calibrate GWR and GWRK, an optimal bandwidth is first found for GWR via cross-validation.The same GWR fit is then used as the trend component of GWRK, and the corresponding residual variogram is found for GWRK.To calibrate the KED models, residual variogram parameters are found via REML, which are then used to parameterize both KED forms, but where an optimal local neighbourhood is found for KED-LN via cross-validation.Variograms for KED and GWRK (Figure 6a,b), both exhibit spatial structure, indicating some value in accounting for spatial autocorrelation effects.Similarly, the cross-validation functions (Figure 6c,d) for the bandwidth in GWR and the neighbourhood in KED-LN, both reach a minimum, indicating some value in accounting for non-stationary relationship effects.As is expected, the variogram structure is much stronger for KED than for GWRK, and the bandwidth/neighbourhood function is much steeper for GWR than for KED (see discussions given in Harris et al. [37]).Interrogating model calibration functions is important, as it provides context to model implementation, as well as helping to interpret the results.Note that for a visual comparison only, we have presented the KED and GWRK variograms using the same WLS variogram fitting procedure, but KED actually used a REML variogram fit.

Prediction Accuracy at the Validation sites
The prediction accuracy results for the five study models are given in Table 3, indicating that KED-LN is clearly the best predictor of water depth, in this instance.For model choice, this result concurs with the simulation study of Harris et al. [37] using the same set of prediction models.Thus, it is important to cater for non-stationary relationship effects (at least between WD and SDRD.T, but also for WD and SEABED) and for spatial autocorrelation effects-both of which KED-LN does.GWRK also does both aspects, but clearly not as well.The poor performance of GWRK can be explained by its focus on modelling non-stationary relationship effects, rather than spatial autocorrelation effects (as shown by the well-behaved GWR bandwidth function in Figure 6c, combined with the weak structure of its residual variogram in Figure 6b).Considering the relatively good performance of KED-GN (our second best predictor), it is clear that, in this instance, accounting for spatial autocorrelation effects is much more important than accounting for non-stationary relationship effects.As expected, MLR is the poorest performer, and we observe also that GWRK only provides a marginal improvement over the simpler GWR fit (which again can be attributed to the observed weak structure in the GWRK residual variogram in Figure 6b).

Analysis of KED-LN Performance
Prediction accuracy plots and maps are given for KED-LN in Figure 7a-c.Clearly the largest prediction errors occur around the deep channel and can be a result of both over-and under-prediction (Figure 7b).There is a significant change in polarity from the western mouth (negative) to the eastern (positive), coupled with a decrease in the amount of anomalous points.A significant number (circa 0.4%) of over-predictions also occur to the far edges, mostly north and to a lesser extent south (Figure 7b).A GW correlation map of actual WD (i.e., the MBES WD data at the validation sites) versus predicted WD (i.e., the KED-LN predictions of WD) is given (Figure 7c) and needs to be compared with the corresponding global correlation of 0.985 (from Table 2).Again, the poorest correlations are centrally located, but an area of poor correlation is also present off Howth (the NE corner).
Conditional accuracy plots for KED-LN are given in Figure 7d,e, where it appears that in general KED-LN's performance does not strongly depend on depth or seabed type.Some notable residual dependencies can be observed though.For example, many of the largest (positive) residuals from KED-LN occur in shallow waters, near the channel (i.e., prediction is deeper than it should be); the other area, off Howth as mentioned above, shows a small cluster of large positive residuals with similar values, near the shoreline coinciding largely with SEABED class 1. SEABED class 2 which largely coincides with the channel (see Figure 4c), also provides the most biased condition boxplot, followed by SEABED class 1, largely along the edges of the bay and associated to some rock outcrops and hardgrounds.

Data Relationships
The linear relationship between SDRD.T and WD was found to be strong with a high negative correlation coefficient of −0.9.This relationship is global reflecting the bay as a whole.However, in investigating this relationship locally, it was found to vary across the bay, where the relationship weakened according to increasing water depth.This heterogeneity in the SDRD.T to WD relationship was thus accounted for in three of the five study predictors (GWR, GWRK and KED-LN), whilst the relationship is naively assumed to be homogeneous in the remaining predictors (MLR and KED-GN).Seabed class was also found to be a good discriminator of WD but this relationship was not assessed across the bay.As such, this relationship was simply modelled locally in GWR, GWRK and KED-LN; but globally in MLR and KED-GN.Spatial autocorrelation in the residual data was also observed and accounted for (in a global fashion only) in the GWRK, KED-LN and KED-GN predictors.Local, heterogeneous autocorrelation was not investigated but could have been using predictors such as those presented in Harris et al. [40].

Performance of this Study's Most Accurate Predictor (KED-LN)
The spatial distribution of the residuals from KED-LN predictions and the spatial distribution of the local (GW) correlations between the actual and predicted WD (presented in Figure 7b,c, respectively) shows a distinct spatial pattern linked to the general bay's geometry, topographic controls and bottom type.The local correlations are stronger and more consistent in the southern section, and in a smaller extent, in the northern section, both areas characterized by low complex topography and homogeneous fine-grained seabed type (classes 3 and 4).Weak correlation and higher residuals patterns are largely associated with high relief controls (Liffey channel) and the edges of the bay (very shallow waters or occasional hardgrounds in class 1).
KED-LN's minor dependency on depth can be observed.Under-prediction occurs mainly in the Liffey channel (observe the red dots of negative residuals in Figure 7b and the points on the bottom left-hand quarter of Figure 7d).One possible explanation is that the signal reflected is largely absorbed, or scattered in the water column, with just a residual component (or inexistent) coming from the seafloor bottom albedo.Therefore, the water-leaving signal represents closely the maximum penetration in the water rather than true depth to bottom.Differences in water column properties could be the main cause for the increase in absorption or scattering effects in these locations.Under-prediction could also be caused by local changes in the seabed type.
Seafloor bottom reflectance is an important component of the overall water-leaving signal in coastal waters [24].The assumption of a uniform bottom type is generally not appropriate in coastal areas due to the dynamic environment near the shoreline characterised by complex sediment and oceanographic patterns.The inclusion in our models of seabed class, as a categorical predictor variable, provides some significant results to further enhance the debate and becomes an initial attempt to better constrain the contribution of the bottom reflectance in the overall scheme.
The central Liffey channel is where the model presents larger uncertainties, larger residuals and reduced accuracy.The rough geomorphological characteristics of the channel and the high backscatter levels indicating harder and/ or rougher seafloor can influence the bottom reflectance contribution to the water-leaving radiance becoming more complex and generally larger than the more homogenous surroundings.Additionally, higher turbidity due to strong sediment fluxes along the navigational channel might also play a role increasing local variability in the water column.This is to a certain extent captured in the conditional boxplots for SEABED class 2 in Figure 7e, with some large negative residuals reflecting large under-predictions.
Rock outcrops and hard bottoms are present in the region approximately covering 5% of the study area.These hard seafloor patches, mostly falling in seabed class 1, should have a larger component from the seafloor bottom albedo than the surroundings primarily due to the greater optical contrast in the water-seabed interface.This is true in the rock outcrops from the northern sector, where the KED-LN predictions generally show consistently positive residuals (prediction is deeper than it should be).
The southern half of the bay is where the KED-LN model preforms better with high (actual WD versus predicted WD) correlation coefficients (r ≥ 0.98), low residuals (residuals ± 0.5 m) and the narrowest residual boxplot (class 3, in Figure 7e).This area is characterised by a smooth topography, largely featureless and with homogenous sediment type.Additionally, the area also benefits from a low dynamic environment compared to the other areas, linked to stable water properties.These conditions are all favourable for a robust model fit.

The Value of our Model Comparison Exercise and Its Transferability
It has been worthwhile to compare a number of statistical predictors, each covering a range of attributes; attributes decided upon via a focused global and local exploratory analysis.We chose models that could account for: (i) global relationships between our variable of interest (WD) and its predictors (SDRD.T and SEABED class); (ii) local relationships between the same variables; and (iii) spatial autocorrelation effects.Via a comprehensive model accuracy assessment, a standard geostatistical model (KED-LN) was found to be the best predictor of water depth.This model was one of two study models (the other being the relatively new, GWRK model) that could capture both local relationships and spatial autocorrelation.This study's group of five prediction models should be readily transferable to similar embayment studies with similar physical characteristics, for predicting water depth.However, it should not be expected that KED-LN would similarly provide the best predictions at all times, but only that one predictor of the group is likely to provide similar and worthwhile levels of accuracy as that found here.Thus the transferability of our approach refers to group of five models as a whole, and not one individual model in particular.That said, our experience suggests that KED will tend to provide the best results; reflecting its theoretical property as an optimal linear predictor (e.g., [39]).Observe also that our approach is suited only to situations where ground measurements exist so that the models can be calibrated; the parameters of this study's models are not transferable to ungauged areas and should never be considered so.

Expected Effects of Reduced Model Calibration Sample Size on Prediction Performance
It is useful to discuss the likely effects of reduced model calibration sample size on each of our study predictors.As sample size decreases, the prediction accuracy of all predictors would degrade, but not necessarily in a uniform manner across the five model fits.Here, a reduction in sample size is likely to create a problem in selecting the bandwidth for GWR and similarly the neighbourhood for KED (i.e., non-stationary relationships cannot be determined due to diminishing local information).As a result, the optimal bandwidth or neighbourhood will tend to a global form.Thus, GWR prediction accuracy would tend to that of the corresponding MLR model, whilst KED-LN prediction accuracy would tend to that of the corresponding KED-GN model.These effects will be more pronounced in KED-LN as its trend component can be viewed as GWR with a box-car kernel whose localness is determined using only un-weighed local data subsets; whereas GWR with a distance-decay kernel, such as the bi-square, can still model locally whist utilising all available information (see Harris et al.) [57].In addition, KED's residual variography would also suffer from diminishing information, ultimately resulting in a pure nugget variogram.Thus, KED's prediction accuracy would also then tend to MLR in this instance.Similar remarks can be made with respect to GWRK whose prediction accuracy would also tend to that of MLR as sample size reduces.Overall, MLR would be the least affected by a reduction in model calibration size, as its non-spatial form requires the least information for reliable model parameter estimation.

Further Considerations
Further work could refine this modelling approach to provide accurate prediction confidence intervals using a Bayesian construction of the KED model (e.g., Pardo-Igúzquiza and Dowd [60].In this study, we deliberately chose not to report these intervals, as it is well-known that those from the standard KED model are of little practical use (e.g., Harris et al., [57]).Assessing predictions at different time periods would also be worthwhile, where the predictors of this study could either be re-applied at each time interval, or alternatively, a full space-time prediction model could be used, again using Bayesian concepts (e.g., [61]).More detailed investigations of the satellite reflectance data may also be worthy, with possible extensions to hyperspectral imagery.Considering the localised theme of this study, a local dimension reduction technique for such high-dimensional imagery may be useful; for example, a GW principal components analysis (e.g., [62]).Finally, if it is known that the study area exhibits well defined discontinuities, then the prediction models should be tailored accordingly.For example, the GWR model could be adapted following that outlined in Gribov and Krivoruchko [63].

Conclusions
RapidEye multispectral sensors were primarily designed for land applications, however, in this study they are used satisfactorily for bathymetric mapping in a representative coastal embayment.This study presents a promising statistical methodology for predicting coastal bathymetry from the water-leaving radiance signal and field calibration data (i.e., water depth measurements and seabed classification).The results confirm that RapidEye datasets are well suited to infer depths up to 12 m using spatial prediction models calibrated with a limited number of water depth measurements (50 depth measurements /km² ).The best model's prediction accuracy, within a range of ±1 m, is 95.0%.
Our assessment reveals that the four spatial models of this study show better adjustments in the basic non-spatial model in the predictions.Prediction accuracy results indicate that a kriging with an external drift model using local kriging neighbourhoods is clearly the best predictor, stressing the importance of catering for both spatial autocorrelation and non-stationary relationships.For this predictor, the correlation coefficient between the actual and predicted water depth data is very strong at 0.985.
In coastal areas, ship-based multibeam or LiDAR bathymetry surveys attaining 100% coverage are sometimes unavailable or prohibitively expensive, while the dynamic nature of the coast makes precision and repeatability a challenge.Spatial prediction models can provide reasonable and error controlled water depth predictions in an inexpensive and efficient manner from a relatively low number of groundtruthing points.The prediction errors presented in this paper are too large to be used in high accuracy products such as navigational charts; however, with the right quality controls applied, they can be applied in a range of important fields from environmental monitoring to seabed mapping and coastal management.

Figure 1 .
Figure 1.Study Area-Dublin Bay, East coast of Ireland.The Liffey channel is clearly centrally located.

Figure 6 .
Figure 6.Model calibration plots: (a) empirical residual variogram for KED with WLS fit, (b) empirical residual variogram for GWRK with WLS fit, (c) GWR bandwidth function with optimum at 55 nearest neighbours, (d) KED neighbourhood function with optimum at 135 neighbours.

Figure 7 .
Figure 7. Accuracy plots and maps for KED-LN at validation sites: (a) scatterplot of actual WD versus predicted WD, (b) map of residuals (actual WD-predicted WD), (c) GW correlation map for actual WD and predicted WD, (d) scatterplot of actual WD versus residuals (with useful thresholds), and (e) conditional boxplots for residuals with respect to seabed class (with useful thresholds).Observe that WD is always measured negatively.

Table 1 .
Seabed type and descriptions.

Table 2 .
Core properties of this study's prediction models.