3.1. Classified Benthic Zone Mapping
Since its initial release, BTM has supported the generation of seafloor classification maps using only a bathymetric surface and a classification dictionary. Traditionally, the bathymetric surface was used to generate two BPI grids (at two different scales) and a slope grid before feeding these intermediate artifacts into the Classify Benthic Terrain tool, where a classification dictionary was created manually or loaded from XML to generate a map of benthic zones for the study area. The newest release of BTM includes the Run All Model Steps tool, which condenses this workflow into a single step and accepts both CSV and Excel files as classification dictionaries.
To demonstrate BTM’s classification mapping capabilities, a subset of the LiDAR bathymetry dataset shown in Figure 7 was classified using the Run All Model Steps tool. The original dataset was clipped to obtain a smaller study area that highlights the benthic structures immediately surrounding Buck Island.
Typical users of BTM should carefully build a unique classification dictionary for each study area by considering the context of the input data and the goals of the analysis. For each benthic zone or habitat in the table, upper and lower bounds of broad scale BPI, fine scale BPI, slope, and depth should be chosen by considering some or all of the following:
Scale and resolution of the input data
Scale of focal operations used to calculate BPI and slope
Previous studies of the area of interest
Typical values observed for the benthic zone of interest
The range of values in each classification artifact
There is no universally applicable approach to creating a classification dictionary for use in BTM. In each case, users will need to combine the above resources with professional judgment, and perhaps iteratively refine the values used. A more rigorous approach is to use a generalized linear model (GLM) to extract the key variables, as in Dunn et al. [15].
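As a rough sketch of the GLM-based approach, the snippet below fits a logistic regression relating a binary habitat indicator to terrain covariates. All data here is synthetic and the variable names are illustrative assumptions, not values from Dunn et al.:

```r
# Synthetic terrain covariates standing in for real BTM outputs
set.seed(1)
n <- 200
dat <- data.frame(
  bpi_broad = rnorm(n),
  bpi_fine  = rnorm(n),
  slope     = runif(n, 0, 30),
  depth     = runif(n, -40, 0)
)
# Synthetic presence/absence response driven mainly by slope and depth
p <- plogis(-1 + 0.1 * dat$slope + 0.05 * dat$depth)
dat$habitat <- rbinom(n, 1, p)

# Fit a binomial GLM; coefficients with small p-values flag the
# covariates that carry the most signal for the response
fit <- glm(habitat ~ bpi_broad + bpi_fine + slope + depth,
           family = binomial, data = dat)
coef_table <- summary(fit)$coefficients
```

In practice, the retained covariates (and their fitted ranges) could then inform the upper and lower bounds entered into the classification dictionary.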
However, the purpose of this example is not to provide an accurate mapping of benthic zones in BIRNM, but rather to demonstrate the typical steps that would be taken within BTM to accomplish that goal. In that context, the classification dictionary shown in Table 3 was created with an alternative approach to the one recommended above.
Two BPI grids (broad and fine scale) and a slope grid were created from the clipped bathymetry dataset using the parameters listed in Table 4. Each grid was then overlaid with polygon features, each representing a benthic zone described in a 2012 benthic habitat map of the study area produced by the NOAA/NOS/NCCOS/CCMA Biogeography Branch [61]. Using the Zonal Statistics as Table tool provided with ArcGIS, the maximum and minimum values of each grid within each polygon were summarized into the classification dictionary and then revised. This method is suitable only for creating a classification dictionary to demonstrate software functionality and should not be used otherwise.
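The summarized bounds can be assembled into a CSV dictionary directly in R. The sketch below shows the general shape of such a table; the column names and bound values are assumptions for illustration and should be checked against the BTM documentation and the actual zonal statistics output:

```r
# Hypothetical classification dictionary rows; NA marks an open bound
dictionary <- data.frame(
  Class          = 1:3,
  Zone           = c("Crest", "Depression", "Flat"),
  BroadBPI_Lower = c(100, NA, -100),
  BroadBPI_Upper = c(NA, -100, 100),
  Slope_Lower    = c(NA, NA, 0),
  Slope_Upper    = c(NA, NA, 5)
)
# Write a CSV that a dictionary-accepting tool could ingest;
# empty cells represent unbounded ranges
write.csv(dictionary, "btm_dictionary.csv", row.names = FALSE, na = "")
```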
The Run All Model Steps tool was then executed in ArcGIS using the parameters summarized in Table 4. The result is shown in Figure 8.
3.2. Integrating R Statistical Analysis
The ease of calculating BTM covariates is further enhanced by the ability to use them in predictive models and other analyses. The direct link between ArcGIS and the statistical programming language R allows data to be transferred to either software and used immediately. This functionality enables the creation of maps, the aggregation and wrangling of data, the calculation of covariates, the examination of relationships between covariates, the building of predictive models, the analysis of diagnostic measures and charts, and more. Ultimately, the bridge makes it possible to apply whichever tools or functions are needed to answer the questions at hand and to create efficient, reproducible workflows. To demonstrate this, the bridge will be used to transfer BTM data to R, apply a dimensionality reduction method, and analyze the results.
After all desired BTM covariates have been derived, working simultaneously in ArcGIS and R with the same data is possible using the R-ArcGIS bridge [66]. The R-ArcGIS bridge consists of the R package arcgisbinding, which provides functions for reading, writing, converting, and manipulating spatial data between ArcGIS and R. The advantages of the arcgisbinding package compared to packages like rgdal are most noticeable when considering the breadth of data that can be transferred and when coordinated data manipulation is needed. For example, the package can read and write to any data source that exists within ArcGIS. This includes vector data stored in formats such as shapefiles, file geodatabases, or a URL for a feature service, and any supported raster data type, including complex types like mosaic datasets. Additionally, in the arcgisbinding package, the same functions used to read in data can also be used to perform actions like creating custom subsets and selections, reprojecting both vector and raster data on the fly, and resampling and adjusting the extent of raster data. All of the above-mentioned functionality is accessed through arc.open, combined with arc.select when working with vector data, as shown, or with arc.raster when working with raster data.
# Load library containing R-ArcGIS bridge functionality
library(arcgisbinding)
# Check connection between R and ArcGIS has been established
arc.check_product()
input_gdb <- "C:/ArcGIS/Projects/BTM/BTM.gdb"
feature_class <- "Field_AAData_ClassifiedLocations"
# Establish pointer to desired data's stored location
arc_locations <- arc.open(file.path(input_gdb, feature_class))
# Convert from an ArcGIS data type to an R data frame object
df_locations <- arc.select(arc_locations)
# Convert from an R data frame object to a spatial data frame object
# from the R package sp
spdf_locations <- arc.data2sp(df_locations)
Once the data is in the desired format, any function or package in R can be used. For example, since covariates typically used in benthic terrain modeling are highly correlated, it may be of interest to extract the linear components of the predictors to alleviate multicollinearity prior to modeling. Principal component analysis (PCA) is one method for accomplishing this: it quantifies how much variance is explained by each component and provides a useful basis for dimension reduction when building predictive models.
# Remove NAs
df_locations <- na.omit(df_locations)
# Creation of training/testing datasets
smp_size <- floor(0.80 * nrow(df_locations))
# Randomly select observation numbers to ensure sample is randomly selected
train_ind <- sample(seq_len(nrow(df_locations)), size = smp_size)
# Subset the original data based on the randomly selected observation
# numbers to make the training data set
df_locations_train <- df_locations[train_ind, ]
# Subset the original data on the remaining observations to make the
# testing data set
df_locations_test <- df_locations[-train_ind, ]
# Make the response variable into a factor
df_locations_train$D_STRUCT <- as.factor(df_locations_train$D_STRUCT)
# Separate response from the covariates
btm.train_covariates <- df_locations_train[, 2:17]
btm.train_response <- df_locations_train[, 1]
# Apply Principal Component Analysis
btm.train_pca <- prcomp(btm.train_covariates,
center = TRUE,
scale. = TRUE)
Results of the PCA can be explored using functions like summary() and plot() to determine the proportion of variance explained by each component and how many components together explain up to 95 percent of the variance. From this point, the results could be used to identify the most influential covariates, which could then be used to construct a predictive model and assess model fit.
This example is just one possibility of the combined functionality of R and ArcGIS. The ease of transferring data back and forth, the ability to perform a variety of statistical and spatial analyses, and the creation of new covariates and maps together allow for in-depth exploration of both the statistics and the science. Final results can be converted back to ArcGIS data types for the production of final maps or tables, and R plots can be integrated into the software through the creation of script tools and models, further integrating the two.