Comparative Analysis of Selected Geostatistical Methods for Bottom Surface Modeling

Digital bottom models are commonly used in many fields of human activity, such as navigation, harbor and offshore technologies, or environmental studies. In many cases, they are the basis for further analysis. They are prepared based on bathymetric measurements, which in many cases have the form of large datasets. Therefore, various interpolation methods are used for calculating these models. In this paper, we present the analysis in which we compared selected methods for bottom surface modeling with a particular focus on geostatistical methods. The aim was to compare five variants of Kriging and three deterministic methods. The research was performed with real data acquired with the use of an autonomous surface vehicle. The collected bathymetric data were reduced (from about 5 million points to about 500 points) and analyzed. A ranking approach was proposed to perform a complex and comprehensive analysis integrating typically used error statistics—mean absolute error, standard deviation and root mean square error. This approach allowed the inclusion of various views on methods of assessment while integrating various metrics and factors. The results show that geostatistical methods perform very well. The best results were achieved with the modifications of classical Kriging methods, which are disjunctive Kriging and empirical Bayesian Kriging. For these two methods, good statistics were calculated compared to other methods (for example, the mean absolute error for disjunctive Kriging was 0.23 m, while for universal Kriging and simple Kriging, it was 0.26 m and 0.25 m, respectively). However, it is worth mentioning that interpolation based on radial basis function in some cases is comparable to Kriging in its performance. The proposed ranking approach was proven to be useful and can be utilized in the future for choosing and comparing DBMs, mostly in mapping and analyzing seabed changes, for example in dredging operations. The research will be used during the implementation of the new multidimensional and multitemporal coastal zone monitoring system using autonomous, unmanned floating platforms. The prototype of this system is at the design stage and is expected to be implemented.


Introduction
Getting to know the bottom of a water region is essential not only for the safety of navigation but also for environmental and industrial issues. Bathymetric surveys are made for construction or dredging in ports, designing the routes of cables and pipelines, discovering minerals or observing and controlling environmental changes (erosion, siltation). An example of using bathymetric data for environmental analysis can be seen in [1]. The authors investigated the spatiotemporal variations of water level fluctuations and tidal amplification. In addition, the authors developed a conceptual model of interactions between the river, tide and morphological evolution for shallow systems (it is worth emphasizing that the data collected for these studies also refer to the shallows). For those data analysis, searching trends and modeling [22][23][24]. This causes more workload than in the case of other methods, but on the other hand, the faster analysis does not have to be more accurate.
There were also some trials to improve the ordinary Kriging method, by providing modifications to the original algorithm. For example, in [25], two variants of Kriging were investigated-ordinary Kriging and universal Kriging for 3D Bathymetric models. In [21], the same authors analyzed deeply the parameters to demonstrate that the level of accuracy that can be achieved depends crucially on the choice of one of these parameters, that is, the mathematical model of the semivariogram. In [26] authors propose a new modification, called partitioned ordinary Kriging, improving its applicability for online updates for autonomous vehicles. On the other hand, another modification called growing radius Kriging is proposed [24]. The purpose of the optimization was to significantly decrease computation time while maintaining the highest possible accuracy of the created model. It can thus be noticed that the Kriging method is willingly analyzed in literature for digital bottom modeling and for comparison with other methods. Usually, two or three methods are compared, and one or two types of Kriging are used. One can, however, notice gaps in the existing literature, which gives room for further research. A complex and comprehensive approach to geostatistical DBM validation can be an added value, as the publications show mainly the selected aspects of the problem, comparing Kriging in general to other methods or going deeper into the Kriging itself. Therefore, the research questions we would like to answer are-what is the influence of various Kriging factors on the DBM, also comparing it to other methods and how to perform a complex and effective verification of the methods?
In any kind of analysis that is relying on a comparison of the accuracy and efficiency of spatial digital models, the key aspect is to choose proper methods and metrics for assessment. They are usually based on geostatistical analysis for some selected points or entire datasets in the model. Typically used metrics are root mean square error (RMSE) and MAE (mean absolute error) [18][19][20]26] or root mean square standardized error (RMSSE) [15,21]. Sometimes other statistics are used, such as minimum and maximum values [10,25] or standard deviation [19,21,25]. In some cases, authors provide their own proposals for the comparison metrics. For example, in [24] the percentage of blank nodes as well as the computation time appeared. In [11], the correlation coefficient is used together with the index of agreement. In [21], prediction standard error is used. In many cases [10,11,25], also cross-validation is used. In [27], the author provides a complex approach to accuracy, comparing first the technologies and then analyzing geostatistical errors. Finally, the Monte Carlo method is proposed to estimate horizontal errors. It can be concluded that many approaches are met for accurate assessment; however, there is still room for rational, complex assessment methods by combining a few factors.
In this study, we have two goals. First, we would like to present a comprehensive comparison of geostatistical and deterministic methods for bathymetric data by performing interpolation surfaces. We analyzed five variants of the Kriging method and three deterministic methods-IDW, RBF and GPI. Especially the last two are not very often met in publications, as well as a variety of Kriging. Secondly, we would like to perform complex verification with the use of various statistics and factors. For this reason, we propose a verification ranking, which allows a comparison of various aspects of the models. The idea of a comprehensive analysis stems from the publications cited earlier-until now, no analysis has been realized that took into account so many factors simultaneously.

Research Methodology
Geostatistical modeling consists of several steps. Figure 1 shows the diagram of the methodology used in this work.

Research Methodology
Geostatistical modeling consists of several steps. Figure 1 shows the diagram of the methodology used in this work. The organization plan and collection of the data are described in part 2. Exploratory Spatial Data Analysis (ESDA) is performed to determine basic statistical parameters (the mean, standard deviation, coefficient of variation and maximum and minimum values), identify outliers and calculate the data distribution [28]. Structural analysis, or spatial continuity analysis, is made to determine the correlation between the data. Based on the semivariogram function, the dependency of distance and direction between these data is checked. It is necessary to model a variogram function properly. After the completion of verification, the next step is choosing one of the interpolation methods and making a prediction. Then the predictions are compared and rated to draw conclusions.
The steps from ESDA to assessing predictions were done in ArcGIS software. ArcGIS has a dedicated extension for geostatistical analysis. This research was not only a geospatial and visualization platform but was also used to create statistics for comparative analysis.

Study Area
Bathymetric data were collected during the survey mission on 21 March 2022 in Lake Klodno on the side of Zawory village, Pomorskie Voivodeship, Poland (red dot in Figure  2). The survey was the test of the unmanned vessel HydroDron-1 at the same time ( Figure  3). HydroDron-1 is an innovative craft because of its autonomy-it is a product of Marine Technology, the company dealing with collecting and processing bathymetric data. The equipment of HydroDron-1 includes an integrated bathymetric and sonar system, data acquisition computer, inertial navigation system, profilometer and sound speed meter in the water, single-beam dual-frequency sonar, single-beam high-frequency sonar, LiDAR, radar and GNSS receiver. Data for analysis were collected using the interferometric sonar PING 3DSS-DX-450 ( Figure 3). The depth measurement involves determining the time describing the way of the hydroacoustic wave from the transducer to the bottom along with the return to the transducer. To determine the value of depth, it is also necessary to know values such as the time of the wave pulse, the immersion of the transducer and the speed of sound in the water, which were measured in this integrated system [29]. The organization plan and collection of the data are described in part 2. Exploratory Spatial Data Analysis (ESDA) is performed to determine basic statistical parameters (the mean, standard deviation, coefficient of variation and maximum and minimum values), identify outliers and calculate the data distribution [28]. Structural analysis, or spatial continuity analysis, is made to determine the correlation between the data. Based on the semivariogram function, the dependency of distance and direction between these data is checked. It is necessary to model a variogram function properly. After the completion of verification, the next step is choosing one of the interpolation methods and making a prediction. Then the predictions are compared and rated to draw conclusions.
The steps from ESDA to assessing predictions were done in ArcGIS software. ArcGIS has a dedicated extension for geostatistical analysis. This research was not only a geospatial and visualization platform but was also used to create statistics for comparative analysis.

Study Area
Bathymetric data were collected during the survey mission on 21 March 2022 in Lake Klodno on the side of Zawory village, Pomorskie Voivodeship, Poland (red dot in Figure 2). The survey was the test of the unmanned vessel HydroDron-1 at the same time ( Figure 3). HydroDron-1 is an innovative craft because of its autonomy-it is a product of Marine Technology, the company dealing with collecting and processing bathymetric data. The equipment of HydroDron-1 includes an integrated bathymetric and sonar system, data acquisition computer, inertial navigation system, profilometer and sound speed meter in the water, single-beam dual-frequency sonar, single-beam high-frequency sonar, LiDAR, radar and GNSS receiver. Data for analysis were collected using the interferometric sonar PING 3DSS-DX-450 ( Figure 3). The depth measurement involves determining the time describing the way of the hydroacoustic wave from the transducer to the bottom along with the return to the transducer. To determine the value of depth, it is also necessary to know values such as the time of the wave pulse, the immersion of the transducer and the speed of sound in the water, which were measured in this integrated system [29]. Data were measured from seven measurement profiles in total, comprising two separate measurement missions; two types of profiles were created-one mission for profiles parallel to the shore and one mission for profiles perpendicular to the shore. All data were surveyed in the WGS 1984 UTM coordinate system zone 34N (EPSG code: 32634). Data from profiles perpendicular to the shore were used for analysis due to the smaller number of points-5 million, while data from parallel profiles contain more than 11 million points. The data from the parallel profiles were left for possible measurement control. Figure 4 presents raw bathymetric data and designed profiles in the HYPACK software. The spatial resolution for these observations is 0.5 × 0.5 m. It should be noted here that Figure 4 is only a preview illustration. The target interpolation surfaces analyzed in the research have different resolutions.  Data were measured from seven measurement profiles in total, comprising two separate measurement missions; two types of profiles were created-one mission for profiles parallel to the shore and one mission for profiles perpendicular to the shore. All data were surveyed in the WGS 1984 UTM coordinate system zone 34N (EPSG code: 32634). Data from profiles perpendicular to the shore were used for analysis due to the smaller number of points-5 million, while data from parallel profiles contain more than 11 million points. The data from the parallel profiles were left for possible measurement control. Figure 4 presents raw bathymetric data and designed profiles in the HYPACK software. The spatial resolution for these observations is 0.5 × 0.5 m. It should be noted here that Figure 4 is only a preview illustration. The target interpolation surfaces analyzed in the research have different resolutions.   Data were measured from seven measurement profiles in total, comprising two separate measurement missions; two types of profiles were created-one mission for profiles parallel to the shore and one mission for profiles perpendicular to the shore. All data were surveyed in the WGS 1984 UTM coordinate system zone 34N (EPSG code: 32634). Data from profiles perpendicular to the shore were used for analysis due to the smaller number of points-5 million, while data from parallel profiles contain more than 11 million points. The data from the parallel profiles were left for possible measurement control. Figure 4 presents raw bathymetric data and designed profiles in the HYPACK software. The spatial resolution for these observations is 0.5 × 0.5 m. It should be noted here that Figure 4 is only a preview illustration. The target interpolation surfaces analyzed in the research have different resolutions.      6 show histograms for raw data with general statistics. The count of points has decreased because ArcMap has removed overlapping points by default. The histogram provides an opportunity to describe the variable. It is a graphical representation of the distribution of the data, i.e., the graph shows how often the observed values fall into predetermined intervals or classes. The horizontal axis of the histogram indicates the specific interval in which the values are combined, while the vertical axis represents the number of points (observations) in these intervals.   6 show histograms for raw data with general statistics. The count of points has decreased because ArcMap has removed overlapping points by default. The histogram provides an opportunity to describe the variable. It is a graphical representation of the distribution of the data, i.e., the graph shows how often the observed values fall into predetermined intervals or classes. The horizontal axis of the histogram indicates the specific interval in which the values are combined, while the vertical axis represents the number of points (observations) in these intervals.

Interpolation Methods
In this work, we decided to also use selected deterministic methods, apart from the geostatistical methods. Deterministic methods are often compared with geostatistical methods to indicate whether geostatistical methods perform better for spatial data, e.g. [18][19][20]23]. Deterministic methods that were used for interpolation were Inverse Distance Weighted (IDW), Radial Basis Function (RBF) and Global Polynomial Interpolation (GPI).

Interpolation Methods
In this work, we decided to also use selected deterministic methods, apart from the geostatistical methods. Deterministic methods are often compared with geostatistical methods to indicate whether geostatistical methods perform better for spatial data, e.g. [18][19][20]23]. Deterministic methods that were used for interpolation were Inverse Distance Weighted (IDW), Radial Basis Function (RBF) and Global Polynomial Interpolation (GPI). When it comes to geostatistical methods, a few kinds of Kriging were used: ordinary Kriging (OK), simple Kriging (SK), universal Kriging (UK), disjunctive Kriging (DK) and empirical Bayesian Kriging (EBK). Estimates of interpolation surfaces were made using ArcGIS software with the Geostatistical Analyst extension.

Inverse Distance Weighted
Inverse Distance Weighted is a deterministic estimation method in which values at unmeasured points are determined by a linear combination of values at nearby measured points [18,30,31]. The main assumption is that the values of the weights are inversely proportional to the powers of the distance between the interpolation points and the measurement point [32]. The IDW function should be used when the set of points is dense enough to capture the range of local surface variation needed for analysis. If the sampling of input points is sparse or uneven, the results may not adequately represent the desired surface area [33]. The advantage of the inverse distance method is its simplicity and efficiency. In addition, IDW can handle extreme terrain changes, such as faults. Disadvantages include sensitivity to outliers-the average cannot be greater than the highest input value nor lower than the lowest. This makes the method unsuitable for creating ridges and valleys. IDW will also not work well for interpolating mountainous areas. The formula for the IDW method is presented below in Equation (1) [31,34]: where: Z j -interpolated value in point j, z i -measured values in points I (i = 1, . . . , n), h ij -distance between points i and j, s-smoothing factor (s > 0), p-weight exponent. The value of the weight exponent is determined by minimizing the RMSE [33].

Radial Basis Function
Radial Basis Function methods are exact interpolation techniques [35]. This means that the interpolation surface must pass through every point where the value is known. The following are five basis functions: thin plate spline (TPS), spline with tension (ST), completely regularized spline (CRS), multiquadric (MQ) (that were used in prediction) and inverse multiquadric (IMQ) [36]. The estimated values are based on a mathematical function that minimizes the total curvature. This causes the resulting surfaces to be smooth. Unlike IDW, RBF can determine values above the maximum and below the minimum of the measured values [17]. RBF gives good results for slightly varying surfaces, such as terrain with small differences in elevation. However, they will not be suitable for large differences in values or for data with large measurement uncertainty [2]. An important advantage of RBF is that it can be used for large data sets (like bathymetric data). The mathematical Equation (2) is given by [36]: where: f (x)-process of the function, ϕ(d)-radial base function, d j -distance between the points of sampling and the predicted point x.
All equations for the previously mentioned five basis functions are given below in Equations (3)-(7) [37]: where: d-distance from sample to prediction location, c-smoothing factor, I 0 ()-modified Bessel function, E-Euler's constant.

Global Polynomial Interpolation
The Global Polynomial Interpolation method involves fitting a smooth surface, defined by a mathematical function (polynomial), to the data points of the input [22]. The method of global polynomial interpolation is classified as an inaccurate interpolator because the mathematical function rarely passes through all the actual measured points. Calculated surfaces are highly susceptible to outliers (extremely high and low values), especially at the edges [33]. This method can be used for trend surface analysis but will also work well in cases where the surface varies slowly from region to region.

Kriging
The Kriging method is often referred to by the acronym B.L.U.E. (best linear unbiased estimator). Kriging is linear because its estimates are linear, weighted, and combinations of data [31]. The unbiasedness is due to the fact that the goal of Kriging is to obtain a mean error equal to 0, while "best" refers to the minimization of error variation. This minimization of error variation distinguishes Kriging from other methods. Equation (8) shows an estimation of unknown values at different points using different sets of weights [34]:ẑ where: z(x 0 )-the unknown value at any point, w i -sets of weights, z(x i )-the value of the measured quantity at neighboring points.
There are a few Kriging methods (or types of Kriging methods), as was said at the beginning of this part. Simple Kriging assumes that the average is known and constant throughout the area. In ordinary Kriging, the average is treated as an unknown but fixed value. Universal Kriging assumes that the unknown local average changes gradually over the study area [38]. Disjunctive Kriging represents a form of nonlinear kriging, which in general offers an improvement over linear kriging methods yet does not require knowledge of the n = 1 joint probability distributions necessary for the conditional expectation [37]. Disjunctive Kriging mainly uses Hermite polynomials and Gaussian bivariate assumptions.
It is not a very popular type of Kriging. Empirical Bayesian Kriging differs from classical Kriging methods. EBK considers the error introduced by the estimation of the semivariogram model. This is done by estimating, and then using, many semivariogram models rather than a single semivariogram. This process entails the following steps: Using this semivariogram, a new value is simulated at each of the input data locations. • A new semivariogram model is estimated from the simulated data. Weight for this semivariogram is then calculated using Bayes' rule, which shows how likely the observed data can be generated from the semivariogram.
Steps 2 and 3 are repeated. With each repetition, the semivariogram estimated in Step 1 is used to simulate a new set of values at the input locations. The simulated data is used to estimate a new semivariogram model and its weight. Predictions and prediction standard errors are then produced at the unsampled locations using these weights [39].

Validation Methods
Before a final interpolation surface is created, it is necessary to verify how well the model predicts values at unknown locations. This is what cross-validation will be used for. Cross-validation uses all data to estimate models. It removes each data location one at a time and predicts the associated data value using the points left behind [37]. The quality of the comparison of results between several estimations can be assessed using estimation quality statistics. These include the following [8,38,40,41]: Mean error (ME); 2.
All statistics are available for geostatistical methods. For deterministic methods, only ME and RMSE can be calculated. Optimally, the values of ME and MSE should be as close to zero as possible [8,40,42,43]. RMSE is useful for comparing interpolation results from different methods; it should be as small as possible. If the ASE is greater than the RMSE, then this indicates that the variability in the dataset has been overestimated; on the contrary, it means that the variability has been underestimated. As for the RMSSE, it should reach a value close to 1 [8,42,43].
Quality analysis of the created interpolation methods will consist of comparing the statistics obtained with their benchmark values, as described above. The approach of creating a smaller test dataset and calculating the estimated values from the interpolation surfaces was also used in comparing the methods. These values were then compared with the measured values at these points. Then, from the estimated points, the mean error or standard deviation can be calculated. Using the RMSE, the sums of similar estimated and measured depth values, calculated mean errors and standard deviations from the test set, a ranking of all methods was created, determining which methods proved to be the best interpolators.
To compare the methods comprehensively, we propose a ranking in which we analyze jointly the above criteria to get a complex response from the research. The ranking is a kind of "averaging" of all the previously mentioned factors, since each of them presents slightly different dependencies, and going by them individually does not make it possible to say unequivocally which method is better than the others. In addition, the ranking will allow verification of the pattern of these statistics-that is, whether one method will be the best or the worst in all statistics, whether half will be the best and the other half the worst, and how this will relate to the final ranking determination. It may turn out here that the best method will be the one for which the statistics were arranged in the middle of the rate of all methods. In the proposed ranking, each method is given a note based on its results calculated with a particular factor. The best method gets 1 and the worst method gets 8, and all the others get numbers sequentially. Finally, the average value for all factors is calculated. The ranking is presented at the end of the analysis in Section 3.2.5.

Initial Data Processing and Analysis
The initial data processing and analysis are essential before proceeding with the interpolation. In this stage, all data information (spatial and non-spatial) was verified and thoroughly analyzed, which is needed to create the best possible models. Without such checking, interpolation results may be unsatisfactory, and it can be difficult to verify the causes of that. This is also the part in which study limitations related to data take place. These limitations are caused by data acquisition and processing characteristics as well as software requirements, which forced a few assumptions that should be taken into account during data and result analysis. First of all, for the practical reasons described in Section 3.1.1, we assumed that a significant reduction of data is needed prior to further processing. We also decided to remove the outliers, as they significantly affect geostatistical calculations. Additionally, it has to be noted that the study was performed on inland water data, which are considered to be shallow waters in general-this affects the proper analysis of results as the accuracy expected is much higher than in the case of deep waters.

Data Reduction
Before the ESDA could be done, the data had to be reduced. A million-point dataset is not representative (many points overlap), which is important in geostatistical analysis. It is recommended to perform geostatistical analysis on a maximum of 500 points-in the case of larger datasets, the results can be unreliable. The reduction was done on data from orthogonal profiles. The tool used for this was AcrGIS Reduce Point Density, in which the radius of thinning was set at 6.5 m. As a result, the number of points was cut from 5,182,027 to 519. The next step was removing the outliers that could affect the interpolation process. The final reduced layer consists of 494 points (Figure 7). It can be noticed, that Figure 7 shows a gap in the raw data. The gaps can occur in cases of temporary failure of the measurement system (e.g., weak connection with an unmanned vessel). Therefore, in a practical approach, the surveys are conducted at least twice to control the results and ensure full coverage. It was assumed that this gap would not affect the resulting interpolation strongly because there is a large neighborhood of depth estimation points around it.

Exploratory Spatial Data Analysis
The reduced data could be exploratorily analyzed. The goal of this part is to get knowledge about the nature of the dataset, which allows for adjusting better the parameters of geostatistical analysis. It consists in fact of the following steps:
Trend analysis.
Firstly, the symbolization of points was changed, dividing them into five depth classes. This allows a general overview of the structure of the data to learn how the depth of the analyzed area is distributed. Figure 8 shows the distribution of depth values in meters.
This step showed that points next to each other are more like each other than those further away, as points in the same depth range formed clusters. Then a histogram analysis was performed (Figure 9). Sensors 2023, 23, x FOR PEER REVIEW 12 of 32 Figure 7. The process of data reduction. Figure 7. The process of data reduction.
Firstly, the symbolization of points was changed, dividing them into five ses. This allows a general overview of the structure of the data to learn how t the analyzed area is distributed. Figure 8 shows the distribution of depth value This step showed that points next to each other are more like each other further away, as points in the same depth range formed clusters. Then a histo ysis was performed (Figure 9). The created histogram shows a leftward asymmetry of distribution (so the skewness coefficient probably has a negative value). This is evidenced by the fact that, in front of the cluster of bars on the left, there are a few data points whose values deviate significantly from the others. The outliers' values and locations were checked. As assumed, these are the outlier points with the deepest values (exceeding 25 m). Checking locations is quite important, as their distribution can significantly affect the resulting interpolation surfaces. Figure 10 shows that outliers are concentrated in one place. The created histogram shows a leftward asymmetry of distribution (so the skewness coefficient probably has a negative value). This is evidenced by the fact that, in front of the cluster of bars on the left, there are a few data points whose values deviate significantly from the others. The outliers' values and locations were checked. As assumed, these are the outlier points with the deepest values (exceeding 25 m). Checking locations is quite important, as their distribution can significantly affect the resulting interpolation surfaces. Figure 10 shows that outliers are concentrated in one place. coefficient probably has a negative value). This is evidenced by the fact that, in front of the cluster of bars on the left, there are a few data points whose values deviate significantly from the others. The outliers' values and locations were checked. As assumed, these are the outlier points with the deepest values (exceeding 25 m). Checking locations is quite important, as their distribution can significantly affect the resulting interpolation surfaces. Figure 10 shows that outliers are concentrated in one place. Then, we compared this Figure with the points outside the profiles. As a control, the depth values of the points for parallel profiles were checked. It turned out that the depth at this location for parallel profiles was in the range of −6-−7 m, which is much less than 25 m. Thus, it can be considered that these are errors, so they were removed. After removing the histogram, it was checked again. Removal of the erroneous points resulted in the histogram now being positively skewed ( Figure 11). The statistics were also checked. Then, we compared this Figure with the points outside the profiles. As a control, the depth values of the points for parallel profiles were checked. It turned out that the depth at this location for parallel profiles was in the range of −6-−7 m, which is much less than 25 m. Thus, it can be considered that these are errors, so they were removed. After removing the histogram, it was checked again. Removal of the erroneous points resulted in the histogram now being positively skewed ( Figure 11). The statistics were also checked. The next tool used in this part was the Normal Q-Q Plot to test whether the dat a normal distribution (Figure 12). The next tool used in this part was the Normal Q-Q Plot to test whether the data has a normal distribution (Figure 12). The next tool used in this part was the Normal Q-Q Plot to test whether the data has a normal distribution ( Figure 12). The graph shows that, except for outliers (the deepest and shallowest points), the data tends toward a normal distribution, as most of the data is very close to the baseline (line on 45° angle). This means that spatial data does not have a normal distribution but tends towards it. The normality of the distribution can be disturbed by, among other things, the presence of a trend, which is checked in the next step ( Figure 13). Trend analysis allows us to determine the presence or absence of a trend in the input dataset and to determine which order of the polynomial best matches the trend. The graph shows that, except for outliers (the deepest and shallowest points), the data tends toward a normal distribution, as most of the data is very close to the baseline (line on 45 • angle). This means that spatial data does not have a normal distribution but tends towards it. The normality of the distribution can be disturbed by, among other things, the presence of a trend, which is checked in the next step ( Figure 13). Trend analysis allows us to determine the presence or absence of a trend in the input dataset and to determine which order of the polynomial best matches the trend. From the trend graph, it was clear that a trend is occurring-the green and blue lines are rounding into the shape of a parabola, which indicates a trend of the second degree. Therefore, for further analysis, it was decided that such trends must be included in further modeling. From the trend graph, it was clear that a trend is occurring-the green and blue lines are rounding into the shape of a parabola, which indicates a trend of the second degree. Therefore, for further analysis, it was decided that such trends must be included in further modeling.

Structural Analysis and Variogram Modeling
For structural analysis, a semivariogram cloud was created. According to the theory (Tobler's 1st law of geography) [44,45], the farther the points are on the x-axis (that is, pairs of points are farther apart), the farther these points should be from 0 on the y-axis. Analysis of the semivariogram cloud showed that as the distance increases, there are fewer and fewer pairs of points that have strong similarities between them ( Figure 14).

Structural Analysis and Variogram Modeling
For structural analysis, a semivariogram cloud was created. According to the theory (Tobler's 1st law of geography) [44,45], the farther the points are on the x-axis (that is, pairs of points are farther apart), the farther these points should be from 0 on the y-axis. Analysis of the semivariogram cloud showed that as the distance increases, there are fewer and fewer pairs of points that have strong similarities between them ( Figure 14). The trend in the data also determines the occurrence of anisotropy, as the data may not only depend on the distance but also on the direction. For comparison, the clouds were checked for angles direction 0°, 45°, 90° and 135°, i.e., for every possible direction ( Figure  15a and 15b). The trend in the data also determines the occurrence of anisotropy, as the data may not only depend on the distance but also on the direction. For comparison, the clouds were checked for angles direction 0 • , 45 • , 90 • and 135 • , i.e., for every possible direction (Figure 15a,b).  Figure 15 show that in the directions 0°, 45° and 90°, the greater the distance, the fewer similar pairs of points there are (although at 0° the number of pairs for most of the graph is similar). The situation is different with a directional angle of 135°-here, with increasing distance, the number of similar pairs also increases. The cloud of the semivariogram at 135° confirms the presence of a trend in the data-the semivariance function increases with further distances. The different level of variation in different directions  Figure 15 show that in the directions 0 • , 45 • and 90 • , the greater the distance, the fewer similar pairs of points there are (although at 0 • the number of pairs for most of the graph is similar). The situation is different with a directional angle of 135 • -here, with increasing distance, the number of similar pairs also increases. The cloud of the semivariogram at 135 • confirms the presence of a trend in the data-the semivariance function increases with further distances. The different level of variation in different directions indicates the presence of spherical anisotropy, so both trend and anisotropy were considered in the modeling variation and estimation stage. Figure 16 shows the variogram model used for estimation. The determinants of this model were previously studied: trend and anisotropy, as well as the lag size. The selection of a lag size has important effects on the empirical semivariogram. For example, if the lag size is too large, short-range autocorrelation may be masked. If the lag size is too small, there may be many empty bins, and sample sizes within bins will be too small to get representative averages for bins [33]. For the analyzed data, the most optimal lag size is approximately 4.93 m. The experimental variogram followed an exponential model, as shown in Figure 16. The function reaches about 95% of the sill value at a distance equal to the range.
increases with further distances. The different level of variation in different directions indicates the presence of spherical anisotropy, so both trend and anisotropy were considered in the modeling variation and estimation stage. Figure 16 shows the variogram model used for estimation. The determinants of this model were previously studied: trend and anisotropy, as well as the lag size. The selection of a lag size has important effects on the empirical semivariogram. For example, if the lag size is too large, short-range autocorrelation may be masked. If the lag size is too small, there may be many empty bins, and sample sizes within bins will be too small to get representative averages for bins [33]. For the analyzed data, the most optimal lag size is approximately 4.93 m. The experimental variogram followed an exponential model, as shown in Figure 16. The function reaches about 95% of the sill value at a distance equal to the range.

Estimation of Interpolation Surfaces
Implementation of interpolation begins with indicating the target method and dataset (and the attribute of data against which the estimation will be performed). Then follows an optimization of general settings (in the case of deterministic methods) such as type of neighborhood, checking the weights of the points or power of the fit. In the case of geostatistical methods, the next step is modeling the variogram (which is described in Section 3.1.3). Next, cross-validation is performed-all error values and target predicted values are estimated. The final step is creating interpolation maps, which can be visualized and analyzed. Figure 17 presents the created interpolation surfaces for eight methods used. The depth values are in meters. In general, the surfaces presented in the figure are more or less similar. However, deeper visual analysis can lead to important details. The surfaces differ mainly in aspects such as the smoothness of the surface, noticeable in the areas where depth significantly changes; the flatness of the area, noticeable as the size of the medium depth area in the middle and extreme areas in the corners; and the size of local differences, e.g., flattening in the deep area. Taking geostatistical methods into account, it can be seen that UK and EBK give the most varied surfaces. In the UK, the sizes of similar depth areas are rough and the shape of flat areas in the deep part is significantly different than in other methods. This might be the result of the assumption that the unknown local average changes gradually over the study area, which affects global smoothness. EBK, on the other hand, gives the most varied surface in terms of depth changes. Extreme values are boosted, while the medium value area is very small. The gradient is the highest in this method. Thus, the flattening area almost disappeared, and on the other hand, the size of the deepest values area and the size of the smallest depths area is the biggest on all surfaces. This is a result of the local adjustment of the semivariogram to the data within the iterated methodology. Other Kriging examples give relatively similar

Estimation of Interpolation Surfaces
Implementation of interpolation begins with indicating the target method and dataset (and the attribute of data against which the estimation will be performed). Then follows an optimization of general settings (in the case of deterministic methods) such as type of neighborhood, checking the weights of the points or power of the fit. In the case of geostatistical methods, the next step is modeling the variogram (which is described in Section 3.1.3). Next, cross-validation is performed-all error values and target predicted values are estimated. The final step is creating interpolation maps, which can be visualized and analyzed. Figure 17 presents the created interpolation surfaces for eight methods used. The depth values are in meters. In general, the surfaces presented in the figure are more or less similar. However, deeper visual analysis can lead to important details. The surfaces differ mainly in aspects such as the smoothness of the surface, noticeable in the areas where depth significantly changes; the flatness of the area, noticeable as the size of the medium depth area in the middle and extreme areas in the corners; and the size of local differences, e.g., flattening in the deep area. Taking geostatistical methods into account, it can be seen that UK and EBK give the most varied surfaces. In the UK, the sizes of similar depth areas are rough and the shape of flat areas in the deep part is significantly different than in other methods. This might be the result of the assumption that the unknown local average changes gradually over the study area, which affects global smoothness. EBK, on the other hand, gives the most varied surface in terms of depth changes. Extreme values are boosted, while the medium value area is very small. The gradient is the highest in this method. Thus, the flattening area almost disappeared, and on the other hand, the size of the deepest values area and the size of the smallest depths area is the biggest on all surfaces. This is a result of the local adjustment of the semivariogram to the data within the iterated methodology. Other Kriging examples give relatively similar results; however, DK calculates the most smoothed surface (with the smallest flat area in the deep part) and OK, the least smoothed, but not as rough as the UK method.
Comparison to other methods shows that the geostatistical approach, in general, provides a nice compromise between surface smoothness and nice visual presentation and model accuracy. For example, in the GPI method, the surface is very smooth, but in many places, important changes are lost, like flattening in the deep part. The shape of the depth areas is significantly different than in other methods. On the other hand, the IDW method interpolates values only locally, which results in a good reflection of depth changes; however, the surface is very rough. RBF gives the most similar results to geostatistical methods but is slightly more sensitive to data changes. results; however, DK calculates the most smoothed surface (with the smallest flat area in the deep part) and OK, the least smoothed, but not as rough as the UK method. Comparison to other methods shows that the geostatistical approach, in general, provides a nice compromise between surface smoothness and nice visual presentation and model accuracy. For example, in the GPI method, the surface is very smooth, but in many places, important changes are lost, like flattening in the deep part. The shape of the depth areas is significantly different than in other methods. On the other hand, the IDW method interpolates values only locally, which results in a good reflection of depth changes; however, the surface is very rough. RBF gives the most similar results to geostatistical methods but is slightly more sensitive to data changes.

Analysis of Results
Interpolation was the final step of geostatistical modeling. Additionally, a deterministic model was created (Figure 14) for comparison. The models were compared according to the verification methodology described in Section 2.4. The plan for the analysis is outlined as follows:

•
Interpolation statistics analysis; • Creating a test point dataset; • Comparison of measured and estimated depth values (based on the test dataset); • Calculation of differences between measured and estimated depth values in the test dataset (mean error, standard deviation); • Ranking methods.
The rules for interpretation and the explanation of their meaning for the accuracy analysis are given in Section 2.4; case analysis of values in this section follows the same rules.

Estimation of Interpolation Surfaces
The metrics used for statistical analysis are presented in Section 2.4. In this chapter, we will show these values for interpolation performed by deterministic methods and by

Analysis of Results
Interpolation was the final step of geostatistical modeling. Additionally, a deterministic model was created (Figure 14) for comparison. The models were compared according to the verification methodology described in Section 2.4. The plan for the analysis is outlined as follows:

•
Interpolation statistics analysis; • Creating a test point dataset; • Comparison of measured and estimated depth values (based on the test dataset); • Calculation of differences between measured and estimated depth values in the test dataset (mean error, standard deviation); • Ranking methods.
The rules for interpretation and the explanation of their meaning for the accuracy analysis are given in Section 2.4; case analysis of values in this section follows the same rules.

Estimation of Interpolation Surfaces
The metrics used for statistical analysis are presented in Section 2.4. In this chapter, we will show these values for interpolation performed by deterministic methods and by Kriging in various variants. Table 1 shows ME and RMSE, which are available for all the methods. Table 2 presents only the interpolation statistics for the geostatistic methods. All values in Tables 1 and 2 are in meters.  Based on the results, the model values were compared with the following observations: 10. In each case, ME is close to 0; 11. The MSE value is also close to 0; 12. Only in the EBK method, the value of ASE is close to RMSE; in the remaining Kriging methods, the ASE is smaller than the RMSE, which may indicate an underestimation of data variability, 13. RMSSE in the EBF method is almost 1, which is almost ideal; in the other cases it exceeds the value of 2, which may indicate an overestimation of data variability, 14. Based on RMSE, the ranking of methods is EBK, RBF, DK, OK, SK, IDW, UK, GPI, where GPI clearly stands apart from the others (values 1.7 where other methods have values 1.2-1.3).

Creating a Test Point Dataset
A new class of objects was created with 40 points for interpolation efficiency comparison. An effort was made to indicate points located at different depths and those at the boundaries between two depth classes ( Figure 18). This goal might not be achieved using the uniform spatial sampling method. In addition, the method used meets important objectives: equal spatial coverage and equal coverage in feature space. For the next two analyses, the estimated values for these 40 points are needed. The GA Layer to Points tool was run to calculate depth values for these points for each created interpolation surface. The result was eight new layers with estimated depth values for the test area. From these layers, values were extracted, and then, target comparisons and cal-  For the next two analyses, the estimated values for these 40 points are needed. The GA Layer to Points tool was run to calculate depth values for these points for each created interpolation surface. The result was eight new layers with estimated depth values for the test area. From these layers, values were extracted, and then, target comparisons and calculations were made. Table 3 shows the values of measured and estimated depth for each used method. All values are in meters. Closest to the estimated values are in bold in the Table. The last row of Table shows the number of the best values for each method. This analysis showed that the most similar estimated values measured is in the IDW method. The next places are followed by EBK, DK, RBF, UK, SK, GPI and OK in last place. The second thing noted is the fact that in this analysis we cannot say if deterministic or geostatistical methods are better interpolators, because there is no significant advantage of one method over the other.

Calculation of Differences between Measured and Estimated Depth Values in the Test Dataset
In the next analysis, the differences between measured and estimated values were calculated. Then, the mean error and standard deviation were calculated. The results are presented in Table 4. Bolded values are the smallest differences between measured and estimated depth values. All values are in meters.  Table 4 shows that the GPI method could create the best interpolation since the mean absolute error is 0.20 m. However, GPI has the highest standard deviation, so the measurement results here are the furthest from the mean, and that fact shatters the suggestion that GPI can be the best interpolator method. A large deviation was also observed for the IDW; additionally, it has the highest average measurement error, which is 0.30 m. The lowest standard deviation, on the other hand, was calculated for the UK method. Generally, in this case, we may conclude that geostatistical methods gave better results because all the Kriging methods have the lowest standard deviation and besides GPI, the lowest mean absolute errors than deterministic methods.

Methods Ranking
The final part of the analysis was preparing a ranking of methods, based on own proposed methodology. The ranking was determined based on the following factors: 15. RMSE; 16. The sum of most similar observed values between measured and estimated; 17. Mean absolute error; 18. Standard deviation.
A summary of the parameters that determine the realization of the best interpolation surface is presented in Table 5. Each method was ranked from 1 to 8, where 1 is the smallest RMSE, mean absolute error, standard deviation, and the biggest number of most similar observed values between measured and estimated. Next, all the ranking notes were averaged, and that average decided the final ranking. The ranking is: Empirical Bayesian Kriging (EBK) 3.
Global polynomial interpolation (GPI) As we can see, the summary of the analysis clearly shows that geostatistical methods are better interpolators for spatial (bathymetric) data.

Verification of the Results with Other Data
The analysis was performed once again in another dataset for verification of results. The dataset was collected in the port of Szczecin, the capital city in Zachodniopomorskie Voivodeship ( Figure 19). Like previous steps, the data were reduced (from 634,913 to 467), and the ESDA and structural analysis were performed, then making estimations and analysis of the results.  What is noteworthy is the fact that the EBK method is the only one that does not interpolate the entire area but only a range of data. Visually, we can also see that IDW and OK interpolated different depth values in the middle of the area than the rest of the methods, which basically confirms earlier findings in Figure 17. A similarity of interpolation can be seen on the right side-the region marked in red indicates deeper values. The visual analysis also indicates that the interpolation of the land is due to the neighboring survey points to the land. Of course, the interpolation of land is wrong, because ArcGIS is not able to define on its own what specific area it interpolates, while the researcher himself is aware of this and analyzes only the area of interest (in this case, the water part).  What is noteworthy is the fact that the EBK method is the only one that does not interpolate the entire area but only a range of data. Visually, we can also see that IDW and OK interpolated different depth values in the middle of the area than the rest of the methods, which basically confirms earlier findings in Figure 17. A similarity of interpolation can be seen on the right side-the region marked in red indicates deeper values. The visual analysis also indicates that the interpolation of the land is due to the neighboring survey points to the land. Of course, the interpolation of land is wrong, because ArcGIS is not able to define on its own what specific area it interpolates, while the researcher himself is aware of this and analyzes only the area of interest (in this case, the water part). Table 6 shows the results of the analysis including searching most similar points and calculations of mean absolute error and standard deviation for each method. The analysis of Table 6 shows that, in general, this surface was a harder case to be modeled, as in all cases mean absolute error achieved is higher. This is the result of the land area between the measurement point, which affects interpolation in most cases (all except EBK). Thus, EBK and RBF give good results as the methods adjusted to sudden changes (like between water and land area). Surprisingly good results were obtained with Simple Kriging, which has a significantly bigger number of similar values than the other. The area itself is rather flat, without major gradients of depths (apart from berths), therefore simple model fits well with the data.  Table 6 shows the results of the analysis including searching most similar p calculations of mean absolute error and standard deviation for each method. The analysis of Table 6 shows that, in general, this surface was a harder c modeled, as in all cases mean absolute error achieved is higher. This is the res land area between the measurement point, which affects interpolation in most except EBK). Thus, EBK and RBF give good results as the methods adjusted t changes (like between water and land area). Surprisingly good results were obta Simple Kriging, which has a significantly bigger number of similar values than The final ranking is presented in Table 7. Analogies and differences that were observed between the main analysis and the verification analysis are listed below: 19. In both cases, the worst methods are IDW and GPI (last place in both final rankings); 20. RBF (Multiquadric) in the first analysis was one of the worst methods, in the second one is in the second place; 21. In both analyses, the best Kriging methods are Simple, Disjunctive and Empirical Bayesian-in the first case, these methods reached in the ranking respectively third, first and second place, in the second one SK was the best and DK and EBK reached the third place; 22. In both cases, the deterministic methods have high standard deviations (especially IDW and GPI).
In general verification research showed that undertaken methodology for interpolation methods assessment is appropriate. RMSE, MAE and standard deviation show qualitative assessment. They are in a large number of cases related to each other. To make the assessment more complete, we have proposed also the sum of the most similar factors, which reflects the quantitative assessment by showing similarity to selected measured data points. Averaging these factors allows us to provide one simple indicator that includes these elements; however, keeping the accuracy (reflected by errors) is the most important.

Conclusions
The work presents interpolation methods that were subjected to comparative analysis using bathymetric data. In addition to geostatistical methods, which are the main target of the work's consideration, three deterministic methods were also included in the comparison, assuming that the results of these methods would be inferior to the geostatistical ones. The comparison and assessment of interpolation methods were performed with parameters such as the MAE, standard deviation, and RMSE. However, additionally, a ranking of methods based on the integration of the above parameters was provided. A ranking itself was proposed as an average of statistical factors reflecting an error, similarity, and dispersion. It is, therefore, an attempt to provide a comprehensive assessment in a relatively simple approach. In the future, modifications can be proposed to weigh the factors in other ways or to introduce other indicators.
Based on the results, it can be concluded that more accurate prediction maps were obtained using geostatistical methods (various types of Kriging methods) than deterministic methods. However, the radial basis function does not at all strongly deviate in results from Kriging methods, unlike inverse distance weighted and global polynomial interpolation. Moreover, sometimes it can be better than geostatistical methods. This confirms what was previously established, that RBF is suitable for working with large data sets. The best results were achieved with DK and EBK. Interestingly, those methods are not classical types of Kriging. EBK does not need to consider either anisotropy or trend, which were observed during ESDA spatial and structural analysis, so this raises the question of what is more important for Kriging interpolation-the determination of trend and anisotropy or creating a semivariogram model based on multiple calculations of this function. DK also differs from the classical approaches of the Kriging method, as it is a non-linear type of this method. Nevertheless, for each method, a rather large average error of measurement (range of 0.20-0.30 m for the main analysis and range of 0.30-0.65 m for the verification analysis) was observed. For cartographic purposes, such an error will not be too problematic, but for precise measurements, such values may prove to be too large. However, it should be noted here that the largest errors arise at the extreme locations, where there are a few neighboring points. The second problem may be related to the reliability of the data-during the ESDA method, several points were captured where the measured values were erroneous. Another issue that should be investigated in the future is the effect of depth on the results-the study described in this paper took place in a fairly shallow and uniform body of water. The next step would be to test a more varied body of water and contrast the results of the uniform bottom analysis with the diverse one. Performing analyses on a wide variety of data will allow a better understanding of the studied data, which is the basis for conducting geostatistical analyses. Further research should also include the analysis of the computational efficiency of interpolation algorithms, taking into account possible implementation in practical applications for bathymetric data modeling.