Semi-Automated Object-Based Classification of Coral Reef Habitat using Discrete Choice Models

As in terrestrial remote sensing, pixel-based classifiers have traditionally been used to map coral reef habitats. For pixel-based classifiers, habitat assignment is based on the spectral or textural properties of each individual pixel in the scene. More recently, however, object-based classifications, those based on information from a set of contiguous pixels with similar properties, have found favor with the reef mapping community and are starting to be extensively deployed. Object-based classifiers have an advantage over pixel-based classifiers in that they are less compromised by the inevitable inhomogeneity in per-pixel spectral response caused, primarily, by variations in water depth. One aspect of the object-based classification workflow is the assignment of each image object to a habitat class on the basis of its spectral, textural, or geometric properties. While a skilled image interpreter can achieve this task accurately through manual editing, full or partial automation is desirable for large-scale reef mapping projects of the magnitude that is useful for marine spatial planning. To this end, this paper trials the use of multinomial logistic discrete choice models to classify coral reef habitats identified through object-based segmentation of satellite imagery. Our results suggest that these models can attain assignment accuracies of about 85%, while also reducing the time needed to produce the map, as compared to manual methods. Limitations of this approach include misclassification of image objects at the interface between some habitat types due to the soft gradation in nature between habitats, the robustness of the segmentation algorithm used, and the selection of a strong training dataset. Finally, due to the probabilistic nature of multinomial logistic models, the analyst can estimate a map of uncertainty associated with the habitat classifications. Quantifying uncertainty is important to the end-user when developing marine spatial planning scenarios and populating spatial models from reef habitat maps.


Introduction
Coral reefs provide important ecosystem services to many coastal communities and small island developing states located in tropical and sub-tropical climes [1][2][3]. As a result, many nations are deploying marine spatial planning as a tool to manage the use of coral reef ecosystems in order to achieve ecological, economic, and social objectives. The development of coral reef habitat maps is a fundamental component to marine spatial planning and modeling efforts [4][5][6][7][8][9][10][11][12][13]. Habitat maps provide an inventory of habitat types, from which one can understand the range of living marine resources that exist across a coral reef seascape and their location relative to human activities and anthropogenic stressors [14] as well as the evolution of the seascape through time [15][16][17][18]. Habitat maps can also serve as input layers to spatially explicit models which can help managers understand the costs and benefits associated with different potential management regimes [19]. In many cases, coral reefs encompass large, interconnected landscape-level areas, which must be considered in their entirety when developing marine spatial management plans. Therefore, large, landscape-level habitat maps must be developed at appropriate scales to properly manage these systems [20,21].
Advances in remote sensor technology, data storage, and computational efficiency have made it possible to develop meter-resolution habitat maps across large coral reef areas [22,23]. However, the ability to quickly produce such high resolution map products in an automated fashion remains a challenge. This bottleneck largely arises due to the fact that developing a habitat map of the highest possible accuracy typically requires a great deal of manual editing, after applying a traditional supervised or unsupervised classification [24,25]. Manual intervention is particularly important when mapping seabed features because of the ever-present classification inaccuracies arising from the fact that the present suite of visible-spectrum satellites with appropriate spatial resolution for use in coral reef environments was primarily designed for terrestrial work. These instruments lack the spectral and radiometric fidelity to accurately differentiate the key habitats of a coral reef which are optically very similar, even prior to the addition of the compounding influence of submergence under several or more meters of water [26][27][28][29][30][31][32][33][34][35]. Furthermore, unlike terrestrial targets, the observation of a seabed from orbit is further hindered by surface effects such as waves, sea surface slicks, and sun glint [24,36,37,38]. Defining the appropriate spatial rules or fuzzy logic to correct for classification inaccuracies arising from these detrimental factors in a map is difficult and often site-specific, such that the same set of rules cannot necessarily be applied elsewhere. Furthermore, defining such rules can be expensive as it often requires extensive local knowledge acquired via ground-truthing by the map producer [25,39]. Finally, although manually applying contextual editing by skilled photo-interpreters greatly improves classification accuracy, it also introduces a layer of classifier subjectivity or error that cannot be easily quantified. This subjectivity is compounded by the fact that it is typically logistically unfeasible in a remote reef environment to collect a sufficient number of ground-truth samples to allow enough to be held back from guiding map production to yield a statistically meaningful accuracy assessment.
As a result, in order to produce meter-scale resolution maps of coral reefs across large seascapes, the goal should be to find an automated algorithm that routinely produces an ecologically meaningful level of classification accuracy in order to minimize the amount of time required for manual editing. So as to honestly report on classification inaccuracies, such an algorithm should also calculate a spatially explicit error raster to represent the uncertainty in the classification assignments of the algorithm. After manual contextual editing, one can assume that, in general, this uncertainty would be an overestimate of classification error. Explicitly modeling error and providing this information to the end-user is important as error can be propagated through landscape analyses or simulation models that use the map end product as an input, and could affect analytical outcomes [40][41][42][43].
Over the years, various algorithms have been used to accomplish automated habitat classifications. Traditionally, as in terrestrial remote sensing, pixel-based classifiers have been used to map coral reef habitats. For pixel-based image analysis, habitat assignment is based on the spectral properties of each individual pixel in the scene. Various mathematical algorithms have been used to relate pixels to different habitats including maximum likelihood/discriminant function analysis [44,45], multinomial and binomial logistic regression [46][47][48], support vector machine based classifiers [49], band ratios [50], kernel principal components analysis [51], and iterative self-organizing data analysis (ISODATA) [52]. Most of these applications have been applied to terrestrial imagery. The use of image texture has also been investigated as a means of identifying optically inseparable reef habitats [53][54][55][56].
More recently, however, object-based classifications, those based on information from a set of contiguous pixels with similar properties, have found favor with the reef mapping community and are starting to be extensively deployed [16,57,58,59,60,61,62,63,64,65]. Object-based classifiers have the advantage over pixel-based algorithms in that they are less compromised by the inevitable inhomogeneity in per-pixel spectral response caused by variations in water depth and sea surface effects. One aspect of the object-based classification workflow is the assignment of each image object to a habitat class on the basis of its spectral, textural, contextual and/or geometric properties. Various methods have been used to classify objects including decision trees [66,67], neural networks [66][67][68], multinomial and binomial logistic regression [66,69,70], machine learning algorithms (i.e., Random Forest) [71,72], maximum likelihood approaches [31], k-nearest neighbor statistics [73], and Bayesian approaches [73].
This paper explores the use of multinomial logistic discrete choice models to classify submerged coral reef features identified through object-based segmentation of satellite imagery. The objectives of this study are to develop an accelerated workflow that allows for more rapid classification of large coral reef seascapes, a reduction in the amount of contextual editing needed, together with a spatial estimate of classification uncertainty, at medium or low cost. The need for an accelerated workflow was inspired by the Khalid bin Sultan Living Oceans Foundation's Global Reef Expedition. As part of this initiative, 10 countries were visited by the mapping team from Nova Southeastern University throughout the world, and over 60,000 square kilometers of coral reef habitat is currently being mapped from 1.8 meter resolution satellite imagery. An object-oriented approach was employed because it has been found to improve classification accuracy in comparison to the pixel-based approach, and it provides a more uniform classification by removing the "salt and pepper" effect that often results from pixel-based classifiers [57][58][59]. Discrete choice models were selected because they are easy to fit and relatively fast to run, in comparison to other approaches such as using support vector machines [49]. In addition, multinomial logistic modelling tools are readily available in most commercial and open source statistical packages. We believe this work to be pertinent given that this strategy has not previously been investigated for mapping coral reef seascapes in the literature.

Methodology
In this study, maps were created in two ways: using multinomial logistic modelling and manually using contextual editing guided by expert interpretation. The purpose for producing maps using these two methodologies was to compare the production of a map using unsupervised classification via a statistical model, to the supervised development of maps using manual editing. This comparison shows which areas of the seascape differ between the two methodologies, and together with an accuracy assessment (described in further detail later in the methodology), can help the reader determine over which areas of a coral reef and under what conditions (i.e., image quality, depth of features) it may be advantageous to apply the algorithm versus deploying a manual editing approach.

Data Acquisition
Imagery from WorldView-2 (WV2), a commercial satellite operated by DigitalGlobe Inc., was acquired for six coral reef atolls (Vanua Vatu, Tuvuca, Mago, Nayau, Fulaga, and Matuka) in the Lau Province of Fiji. WV2 satellite data consist of eight-band multispectral imagery with 1.84 meter resolution. It should be noted, however, that only the first five bands, in order of increasing wavelength, penetrate water and are therefore useful for seabed mapping: coastal blue, blue, green, yellow and, to a limited degree, red. The remaining three bands stretch through the infrared and have utility for differentiating land from water, as well as identifying and correcting for sea-surface effects. All imagery used for this project was acquired a maximum of one year prior to ground-truthing and, to the extent possible, scenes were selected free of cloud cover, as well as acquired without sun glint and other surface effects. The data were first radiometrically corrected to pixel units of reflectance at the top of the atmosphere using parameters supplied by DigitalGlobe [74]. Next, an atmospheric correction was applied to yield units of water-leaving reflectance. If necessary, images were processed to eliminate most wave and sun glint patterns from the sea surface [36,37].
An extensive field campaign took place in June of 2013 in the Lau Province aboard the M/Y Golden Shadow as part of the Living Oceans Foundation's Global Reef Expedition. As detailed in Purkis et al., 2014 and 2015 [16,75], fieldwork utilized a 6.5 meter day-boat to collect ground-truth data across all habitats and depth ranges using acoustic depth measurements, seafloor videos by a tethered camera, and digital photographs during SCUBA and snorkel surveys.

Image Segmentation, Habitat Definition, and Sample Selection
Once the imagery was mosaicked and prepared for analysis, it was segmented using the multiresolution segmentation model in the eCognition software (Trimble, eCognition Suite Version 9.1). Multiresolution segmentation in the eCognition software is the iterative grouping of pixels with common characteristics into objects, using spectral and shape homogeneity criteria. It is driven by an optimization algorithm and continues until the algorithm has determined an optimum set of image objects from the satellite imagery. This algorithm was selected to segment the images used in this analysis because it yields objects that provide a good abstraction and shaping of submerged reef features using the reflectance of the WV2 bands. The multiresolution segmentation algorithm is informed by shape, scale, and compactness parameters. During this exercise, a scale parameter of 25, a shape parameter of 0.1, and a compactness parameter of 0.5 were used to segment each WV2 image. These parameters were selected because experience using this algorithm in eCognition on WV2 imagery of coral reefs has shown that they provide segmentation with the clearest definition of habitat boundaries. Only the five visible-spectrum bands were used for segmentation of the marine areas. Once segmented, an attribute was assigned to each object in order to designate it as belonging to either the fore reef, back reef, lagoon, land, or deep ocean (beyond the 30 meter detection depth of the satellite) zone. Zones were delineated by first classifying the reef crest using a combination of the infrared bands, which are able to pick out emergent or close to emergent portions of the reef crest, together with manual contextual editing. Land was similarly zoned using the infrared bands. Areas of deep water seaward of the reef crest were masked using reflectance threshold values from the blue band number two (reflectance values greater than 0.15). Image areas between the reef crest and deep ocean water were defined as fore reef, while the remaining areas landward of the reef crest were split into either back reef or lagoon, based on depth and proximity to the reef crest (Figure 1b) (note that the atoll represented in Figure 1b happens to not have a lagoon).
Nineteen ecologically meaningful habitat classes were defined for use during classification (Table 1). Sea floor videos and photographs from tethered camera, diving, and snorkeling survey efforts throughout the locations surveyed in Fiji, in conjunction with an honest appraisal of the satellite imagery and the features it is capable of resolving, were used to develop the habitat classes. Survey efforts identified the biology, hydrology, sedimentology, and topography of each site, and used these characteristics to define each habitat class. The colors of the assigned habitat classes in Table 1 correspond to the colors that will be used in the habitat maps throughout this manuscript.

Mud flats: Area of very shallow, fine grained sediment with primarily microbial benthic community.
Deep Ocean Water: Submerged areas seaward of the fore reef that are too deep for observation via satellite.
Deep Lagoon Water: Submerged areas leeward of the fore reef that are too deep for observation via satellite.
Terrestrial Vegetation: Expanses of vegetation (e.g., palm trees, tropical hardwoods, grasses, shrubs) on emergent features or islands.
Beach sand: Accumulations of unconsolidated sand at the land-sea interface on emergent features or islands.
Inland waters: Bodies of fresh or briny water surrounded by emergent features that may either be isolated from or flow into marine waters.
Urban: Man-made structures such as buildings, docks, and roads.
Unvegetated Terrestrial: Soil or rock on islands with no discernible vegetative cover.

Field survey observations from tethered camera, dive, and snorkel, and acoustic depth measurements were taken from all of the sites that we visited in the Lau Province of Fiji. These observations were pooled together and matched to the image objects in the same location segmented from the satellite imagery to form one dataset for all of Fiji. The image object co-located with each field data point was assigned a habitat class using the field observations. Statistical distributions of the spectral signatures, shape, texture, and size characteristics of image objects that were classified by the field data were calculated for each habitat classification. Calculating the statistical distributions of image object characteristics for each habitat type provided knowledge of the range of values each of these characteristics spans for a particular habitat type. This one dataset of statistical distributions for each image object characteristic was used to select a training dataset of image objects for each site to be mapped (for example, n = 12,588 for Vanua Vatu). The training dataset was developed by randomly selecting 10 percent of the unclassified image objects, and using the characteristics of these unclassified image objects, together with the statistical distributions of the object characteristics, and knowledge of the coral reef system from our visit, to classify each selected image object. The training dataset for Vanua Vatu is represented in Figure 1a as orange points. The training dataset was randomly split into two components, with approximately two-thirds of it (n = 8380 for Vanua Vatu) used to fit the discrete choice models, and the other third (n = 4208 for Vanua Vatu), hereinafter referred to as test data, used for model validation as discussed in the last paragraph of the methods section.
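The random two-thirds/one-third partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `split_fit_test`, the fixed seed, and the use of integer object identifiers are hypothetical and are not taken from the study, which reports an approximate split (8380 and 4208 objects for Vanua Vatu) rather than an exact two-thirds.

```python
import random


def split_fit_test(object_ids, fit_fraction=2 / 3, seed=0):
    """Randomly partition labelled image objects into fit and test subsets.

    object_ids   : iterable of identifiers for the training objects
    fit_fraction : share of objects used to fit the model (assumed 2/3 here)
    seed         : fixed seed so the illustration is repeatable (an assumption)
    """
    rng = random.Random(seed)
    shuffled = list(object_ids)
    rng.shuffle(shuffled)                      # random order, no replacement
    n_fit = round(len(shuffled) * fit_fraction)
    return shuffled[:n_fit], shuffled[n_fit:]  # (fit data, test data)


# Hypothetical identifiers standing in for the 12,588 training objects.
fit_ids, test_ids = split_fit_test(range(12588))
print(len(fit_ids), len(test_ids))
```

Holding the test subset out of model fitting is what allows it to serve as an independent check in the accuracy assessment.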
The purpose for developing a training dataset was two-fold. First, a sample size larger than that collected in the field was needed so the multinomial logistic models would converge. Second, in some cases we were unable to access all portions of an atoll that we visited either due to physical barriers such as very shallow reefs or mountains, cultural barriers that limited access to certain locations, or weather events such as high seas that restricted our survey efforts to leeward regions or protected lagoons. Thus developing and using a training dataset for this analysis ensured that a spatially representative dataset across all habitat types and areas of the imagery to be mapped was used to fit the discrete choice models. This is important because of the natural variability in object characteristic values across a satellite scene due to differences in camera angle in different locations throughout the scene, air quality, water quality, and atmospheric moisture, as well as differences between image strips that were mosaicked together to form the scene.

Multinomial Logit Model
A multinomial logistic model was set up to model the selection of habitat class as it relates to object characteristics as defined by image segmentation using eCognition. The multinomial logit model was developed by McFadden [76] and was based on random utility theory and discrete choice theory in urban economics and behavioral science [77]. Multinomial logistic regression is an extension of binomial logistic regression and is well suited for classifying individual observations into one of multiple groups, and the technique has been successfully applied across multiple fields including genomics, transportation, remote sensing, and fisheries [78][79][80][81][82]. Although this approach is often used to model the decision-making of individuals based on certain individual and environmental characteristics, objects segmented from a satellite image can be considered in a similar way as they also bear individual characteristics, and exist in a surrounding spatial environment that also has its own set of properties.
Multinomial logistic regression is well suited to categorizing unclassified objects because the response variable for each object can take more than two possible states (e.g., coral, rubble, seagrass, macroalgae). The linear predictor of the logit (l) represents the natural logarithm of the ratio between the probability (P) that an object (i) is a member of a habitat class (j) and the probability that it is not (1-P). The linear predictor of the logit can be directly calculated for any given habitat class and object using the estimated parameters from the logistic regression (Equation (1)), where a is the intercept, b represents the predictor parameters estimated during model fit, X are the characteristics of each object being classified (e.g., size, shape, brightness, band value), and k indexes the characteristic (i.e., covariate).
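Written out from the verbal definitions above, Equation (1) can be sketched as follows. The subscripts attached to a and b (one intercept and one set of predictor parameters per habitat class j) are a notational assumption; the left-hand side follows the description of the logit given in the text.

```latex
% Equation (1): linear predictor of the logit for object i and habitat class j
\begin{equation}
  l_{ij} \;=\; \ln\!\left(\frac{P_{ij}}{1 - P_{ij}}\right)
         \;=\; a_{j} + \sum_{k} b_{jk}\, X_{ik}
\end{equation}
```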
The multinomial logistic regression formula calculates the probability that each unclassified object belongs to a certain habitat class. This formulation is derived from Equation (1) for all habitat classes except for the reference variable (Equation (2)), where J equals the maximum number of response variables (i.e., habitat classes) being predicted. Since the dependent variable in multinomial logit models is discrete, one category of the dependent variable is chosen as the reference variable. In our case, the dependent variables were the habitat classes that we were trying to predict, so the reference variable was one of the habitat classes. The probability that an unclassified object belongs to the reference habitat class is given by Equation (3). The multinomial logistic regression assumes independence of irrelevant alternatives, which means that the odds that one habitat is selected over another, as calculated by the model, would remain the same even if additional habitats were added as possible alternatives.
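Assuming the standard reference-category form of the multinomial logit, with habitat class J taken as the reference and the same symbols a, b, X, i, j, and k as above, Equations (2) and (3) can be sketched as follows. The summation index h is introduced here only to avoid reusing j inside the denominator.

```latex
% Equation (2): probability of habitat class j (non-reference classes, j = 1, ..., J-1)
\begin{equation}
  P_{ij} \;=\; \frac{\exp\!\left(a_{j} + \sum_{k} b_{jk} X_{ik}\right)}
                    {1 + \sum_{h=1}^{J-1} \exp\!\left(a_{h} + \sum_{k} b_{hk} X_{ik}\right)},
  \qquad j = 1, \ldots, J-1
\end{equation}

% Equation (3): probability of the reference habitat class J
\begin{equation}
  P_{iJ} \;=\; \frac{1}{1 + \sum_{h=1}^{J-1} \exp\!\left(a_{h} + \sum_{k} b_{hk} X_{ik}\right)}
\end{equation}
```

In this form the J probabilities sum to one for every object, and the linear predictor equals the log of the odds of class j relative to the reference class, which coincides with ln(P/(1-P)) when J = 2.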
For each population of unclassified objects, the dependent variable follows a multinomial distribution with J levels. To estimate the parameters, the data are aggregated into populations, y, where each population, indexed by k and denoted $y_k$, is one unique combination of the independent variable settings (i.e., size, shape, reflectance, texture of a given object in the satellite imagery). The column vector n contains elements $n_k$, which represent the number of observations in each population such that $\sum_{k=1}^{N} n_k = M$, where M is the total sample size (i.e., total number of sample objects used to fit the model). The matrix y, with N rows (one for each population) and J-1 columns (one for each habitat class being predicted minus the reference group), contains the observed counts of the jth value of each multinomial random variable (i.e., object size, shape, texture, reflectance). The variable $P_{kj}$ is the probability of observing the jth value of the dependent variable for any given observation in the kth population. Given this, the joint probability density function can be expressed as Equation (4). Note that when J = 2, this reduces to the binomial logistic model. The likelihood function is algebraically equivalent to the probability density function (Equation (4)); however, it expresses the unknown values of b in terms of known fixed constant values for y. It is calculated by substituting Equations (2) and (3) into Equation (4). If we let q represent each independent variable (i.e., object size, shape, reflectance, texture), then the log likelihood function for the multinomial logit can be expressed in terms of q and the parameters b. The matrix of parameter estimates was then used to calculate a predicted probability that each unclassified object belonged to one of the habitat classification dependent variables. The dependent variable that was assigned the highest probability was then selected as the habitat classification for that particular object. The probability of misclassification for a particular object was then simply one minus the probability of the selected habitat dependent variable.
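Under the definitions in this paragraph, and assuming the standard grouped-data form of the multinomial likelihood, the joint probability density function (Equation (4)) and the resulting log likelihood can be sketched as follows. Several notational choices here are assumptions: the count for the reference class is taken as $y_{kJ} = n_k - \sum_{j=1}^{J-1} y_{kj}$, the intercept is absorbed into the sum over q through a constant covariate $X_{k0} = 1$, and the factorial term, which does not depend on b, is dropped from the log likelihood.

```latex
% Equation (4): joint probability density function over the N populations
\begin{equation}
  f(y \mid b) \;=\; \prod_{k=1}^{N}
    \left[ \frac{n_{k}!}{\prod_{j=1}^{J} y_{kj}!}\; \prod_{j=1}^{J} P_{kj}^{\,y_{kj}} \right]
\end{equation}

% Log likelihood after substituting Equations (2) and (3) into Equation (4)
\begin{equation}
  \ell(b) \;=\; \sum_{k=1}^{N}
    \left[ \sum_{j=1}^{J-1} y_{kj} \sum_{q} X_{kq}\, b_{qj}
      \;-\; n_{k} \ln\!\left( 1 + \sum_{j=1}^{J-1} \exp\!\Big( \sum_{q} X_{kq}\, b_{qj} \Big) \right) \right]
\end{equation}
```

The second expression follows because $\sum_{j=1}^{J} y_{kj} \ln P_{kj}$ splits into the linear predictors of the non-reference classes minus $n_k$ times the log of the shared denominator from Equations (2) and (3).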
Separate multinomial logistic models were fit to the habitats located within each zone. Model fitting was accomplished using maximum likelihood and parameters were added in a forward stepwise fashion using Akaike's Information Criterion (AIC) [83]. The R Project for Statistical Computing and the "mlogit" package it contains were used to fit the models on a Windows-based laptop computer. Parameters tested in the model included: (1) the mean reflectance of each WV2 band corresponding to the pixels subtended by each object; (2) all combinations of band ratios for bands one through five; (3) object size calculated as the number of pixels that comprise that image object; (4) a shape parameter describing the smoothness of an image object's border as calculated by dividing the border length by four times the square root of the object's area; (5) brightness defined as the color saturation or intensity; (6) texture of the object, represented as the standard deviation of each band, where standard deviation represents the degree of local variability of the pixel values inside each object [66].
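Three of the candidate parameters listed above have simple closed forms and can be sketched in Python. This is a minimal illustration, not the eCognition implementation: the function names, the dictionary layout, and the reflectance values are hypothetical, band reflectances are assumed to be non-zero, and the population standard deviation is an assumed choice for the texture measure.

```python
import math
import statistics
from itertools import combinations


def shape_index(border_length, area):
    """Border smoothness: border length divided by four times sqrt(area)."""
    return border_length / (4.0 * math.sqrt(area))


def band_ratios(mean_reflectance):
    """Ratio for every pair of bands, given {band number: mean reflectance}.

    Assumes non-zero reflectance values for every band of the object.
    """
    ratios = {}
    for i, j in combinations(sorted(mean_reflectance), 2):
        ratios[f"b{i}/b{j}"] = mean_reflectance[i] / mean_reflectance[j]
    return ratios


def texture(pixel_values):
    """Texture of one band: standard deviation of the object's pixel values."""
    return statistics.pstdev(pixel_values)


# A square object of 10 x 10 pixels has border length 40 and area 100.
print(shape_index(40, 100))

# Hypothetical mean reflectances for WV2 bands one through five.
ratios = band_ratios({1: 0.20, 2: 0.40, 3: 0.10, 4: 0.05, 5: 0.02})
print(len(ratios))
```

With five bands there are ten unordered band pairs, and a square object yields a shape index of exactly one, so less compact or more ragged objects score higher.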
Parameters considered in each of the models were selected based on their ability to distinguish submerged reef habitat. For example, different bottom types in shallow, tropical seas with good water clarity have been found to show strong spectral distinctions from one another [21,28,29,30,31,32]. Band ratios were included as factors because they provided an index of relative band intensity, which enhanced the spectral differences between the bands while reducing depth effects [50,84]. The size and shape of the image objects that resulted from the segmentation process tended to be dependent on the habitat they represent. Habitat types that reflected more uniformly, such as seagrass beds, tended to break into larger, more symmetrical segments, compared to objects that were more rugose, such as coral, which tended to break into smaller, less symmetrical segments. In addition, shape was very useful at pulling out bommies from satellite imagery, especially in deeper water when approaching the edge of the image detection boundary. This is because bommies, if properly segmented out (usually using a smaller scale parameter), tended to be rounder, in comparison to the segments that formed from the surrounding sediment or macroalgae-covered lagoon floor. Similarly, texture represented the uniformity of object pixels within an object, with seagrass or sand reflectance being more uniform than that from coral or rubble fields.
Once a final model was determined, that model was then used to predict the classification of the remaining unclassified objects. Classification was predicted by calculating the probability that an unidentified image object belonged to each of the different possible habitats. The habitat with the highest probability for a given object was the habitat that the object was assigned. The probability of that object being misclassified was simply calculated as one minus the probability of the habitat selected.
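The assignment rule described above can be sketched in Python as follows. This is an illustration only, assuming the standard reference-category form of the multinomial logit; the function names, the habitat labels, and the linear predictor values are hypothetical and do not come from the fitted models.

```python
import math


def class_probabilities(linear_predictors, reference):
    """Convert linear predictors into habitat probabilities.

    linear_predictors : {habitat: linear predictor} for non-reference classes
    reference         : name of the reference habitat class
    """
    exp_terms = {h: math.exp(v) for h, v in linear_predictors.items()}
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {h: e / denominator for h, e in exp_terms.items()}
    probabilities[reference] = 1.0 / denominator
    return probabilities


def classify(probabilities):
    """Return the most probable habitat and its misclassification probability."""
    habitat = max(probabilities, key=probabilities.get)
    return habitat, 1.0 - probabilities[habitat]


# Hypothetical linear predictors for one image object.
probs = class_probabilities({"coral": math.log(2.0), "rubble": 0.0},
                            reference="sediment")
habitat, uncertainty = classify(probs)
print(habitat, round(uncertainty, 3))
```

Writing the second value of `classify` back to each image object is what yields a spatially explicit map of classification uncertainty alongside the habitat map.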

Accuracy Assessment
In this study, accuracy was assessed by comparing the algorithm-produced maps to the test data, which represented a random sampling of one third of the training dataset (n = 4208 for Vanua Vatu) and was not used to fit the models. An error matrix (also often referred to as a confusion matrix or contingency table) was generated by habitat category. The error matrix was used to calculate overall accuracy by dividing the number of correctly classified test data objects (sum of the diagonal) by the total number of test data objects. Producer and user accuracy metrics were calculated by habitat type. Producer accuracies represented the number of test objects within a particular habitat classified correctly by the algorithm divided by the number of test objects of that same habitat type. Producer accuracies informed the reader how well the objects that made up the training dataset were classified. User accuracies were calculated by dividing the number of objects correctly classified by the algorithm in each habitat by the total number of objects that the algorithm assigned to that habitat. This metric portrayed commission error, the chance that an object classified as a particular habitat actually represented this habitat if you were to visit the exact location in the field.
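The three accuracy measures described above can be sketched in Python from paired reference and predicted labels. This is a minimal illustration: the function name `accuracy_report` and the six example labels are hypothetical, and a class that never appears in one of the two label lists is reported as `None` rather than dividing by zero.

```python
from collections import Counter


def accuracy_report(reference_labels, predicted_labels):
    """Overall, producer, and user accuracies from paired label lists."""
    # Error matrix stored sparsely as counts of (reference, predicted) pairs.
    pairs = Counter(zip(reference_labels, predicted_labels))
    classes = sorted(set(reference_labels) | set(predicted_labels))

    correct = sum(pairs[(c, c)] for c in classes)  # diagonal of the matrix
    overall = correct / len(reference_labels)

    producer, user = {}, {}
    for c in classes:
        n_reference = sum(pairs[(c, p)] for p in classes)  # test objects of class c
        n_predicted = sum(pairs[(r, c)] for r in classes)  # objects assigned to c
        producer[c] = pairs[(c, c)] / n_reference if n_reference else None
        user[c] = pairs[(c, c)] / n_predicted if n_predicted else None
    return overall, producer, user


# Hypothetical test objects: field-derived class versus model prediction.
reference = ["coral", "coral", "coral", "sand", "sand", "rubble"]
predicted = ["coral", "coral", "sand", "sand", "sand", "coral"]
overall, producer, user = accuracy_report(reference, predicted)
print(round(overall, 3), producer, user)
```

In this toy example the rubble object is never predicted as rubble, so its producer accuracy is zero while its user accuracy is undefined.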

Results
The methodology described above was applied to all six atolls in the Lau Province of Fiji. Results are presented in detail for one of the atolls, Vanua Vatu, to demonstrate the workflow; in the interest of brevity, selected results from the other atolls are highlighted to illustrate the strengths and limitations of the model. WV2 imagery for Vanua Vatu is presented in Figure 1a, together with the locations of tethered camera surveys, SCUBA surveys, and the training data used to fit the models. Processing using eCognition segmented the satellite imagery into 127,513 objects. Figure 1b shows how Vanua Vatu atoll was partitioned into zones.
Figure 2a shows habitat class assignments made by the model for Vanua Vatu, in comparison to a map (Figure 2b) that was developed manually using contextual editing alone. Comparison of the test data image objects set aside from the training data with the model predictions for these image objects showed an overall accuracy of 85 percent. The error matrices and accuracy calculations (Tables 2 and 3) correspond to some of the areas where the algorithm and manual image object classifications differ (Figure 2). In the back reef zone of Vanua Vatu, the model tended to systematically overpredict coral habitat where dense macroalgae stands were located, and also predicted back reef sediment where rubble existed, and vice versa. On the fore reef, the model was essentially binomial, as it was only trying to differentiate between coral and sediment. Although spectrally and texturally very different, these two classes were at times difficult to discern as the fore reef slopes to depth because of the near-exponential attenuation of light by water. On land, the model was unable to differentiate man-made features (defined as urban in Table 1), primarily confusing them with beach sand and unvegetated terrestrial areas. Therefore, in order to attain a meaningful accuracy, objects belonging to the urban class needed to be manually classified via contextual editing. It is worth noting that urban landscapes are notoriously difficult to identify in visible-spectrum satellite imagery [85,86]. In addition, objects on land that were determined to be unvegetated terrestrial were often misclassified by the model as vegetated terrestrial.
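As a minimal sketch of how overall, user's, and producer's accuracies of the kind reported in Tables 2 and 3 are conventionally derived from an error matrix, the following Python fragment assumes, as in Table 2, that rows hold model predictions and columns hold reference labels. The function name, class names, and counts are hypothetical and are not values from this study.

```python
def accuracy_metrics(matrix, classes):
    """Summarise an error matrix.

    matrix[i][j] is the number of test objects predicted as classes[i]
    whose reference label is classes[j] (rows = model predictions).
    """
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    overall = correct / total

    users, producers = {}, {}
    for i, name in enumerate(classes):
        row_total = sum(matrix[i])                       # all objects predicted as this class
        col_total = sum(matrix[r][i] for r in range(n))  # all reference objects of this class
        users[name] = matrix[i][i] / row_total if row_total else float("nan")
        producers[name] = matrix[i][i] / col_total if col_total else float("nan")
    return overall, users, producers


# Hypothetical counts, for illustration only.
classes = ["coral", "sediment", "macroalgae"]
matrix = [
    [50, 5, 10],  # predicted coral
    [3, 40, 2],   # predicted sediment
    [2, 0, 8],    # predicted macroalgae
]
overall, users, producers = accuracy_metrics(matrix, classes)
```

In this illustrative matrix, macroalgae has a low producer's accuracy because many reference macroalgae objects were predicted as coral, which mirrors the kind of confusion described above.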
Similar issues were identified at some of the other five sites that were mapped using discrete choice models. In Matuka (Figure 3), for instance, the model systematically overpredicted coral in areas where there were extensive seagrass meadows (seagrass producer's accuracy of 0.26 and user's accuracy of 0.63) and missed coral in portions of the lagoon floor (lagoon coral producer's accuracy of 0.73 and user's accuracy of 0.71). As previously discussed, the model was not able to effectively identify fore reef sand flats (producer's accuracy of 0.06 and user's accuracy of 0.43), particularly when they were located close to the depth limit of the sensor's capability (about 30 meters in clear tropical waters). However, in places where the fore reef was characterized by a wider shelf and a gentle gradient, such as the east side of Tuvuca (Figure 4), the model more successfully identified the fore reef sand flats class (producer's accuracy of 0.71 and user's accuracy of 0.80). The urban features present on the peninsula in the center of Figure 3, Matuka, were also not well defined by the model (producer's accuracy of 0.52 and user's accuracy of 0.68), though in this example at least some of the urban features were accurately identified. Most disconcerting, however, was that the model completely misidentified mangroves as terrestrial vegetation.
In Fulaga, Fiji (Figure 5), the model poorly identified the large swaths of macroalgae on the lagoon floor (producer's accuracy of 0.27 and user's accuracy of 0.68). For this site, however, the model did successfully distinguish coral in the lagoon, in particular small coral patches (so-called "bommies") (producer's accuracy of 0.75 and user's accuracy of 0.75). Overall, though, the model was particularly adept at delineating seagrass beds, and the coral-to-sediment and coral-to-rubble interfaces, even in cases where patches of coral were very small. This is an important strength of the model because manually delineating coral patches from sediment in the back reef and lagoon zones is invariably very time consuming, owing to their ornate growth structures [87].
Spatially, uncertainty in model predictions (Figure 6) was highest at habitat transition boundaries (i.e., where one habitat graded into another) and lowest in the interior of homogeneous areas of a single habitat. For Vanua Vatu, classes that the model did not predict well, such as areas in the imagery with man-made structures that should have been classified as urban, or areas where macroalgae are located adjacent to coral, showed higher uncertainty.
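The link between class probabilities and the uncertainty map can be sketched as follows. This is an illustration under stated assumptions rather than the fitted model: the utilities (linear predictors) are hypothetical numbers, the function names are invented for the example, and uncertainty is taken to be one minus the largest class probability, which is one reading of the definition given in the caption of Figure 6 (the probability that an object was classified incorrectly).

```python
import math


def softmax(utilities):
    """Multinomial logit probabilities from per-class utilities."""
    m = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_uncertainty(utilities, classes):
    """Return the most probable class and 1 - P(that class)."""
    probs = softmax(utilities)
    best = max(range(len(classes)), key=lambda k: probs[k])
    return classes[best], 1.0 - probs[best]


classes = ["coral", "rubble", "sediment"]

# Hypothetical utilities: an object in the middle of a coral patch
# versus one sitting on a coral-to-rubble transition.
interior = classify_with_uncertainty([4.0, 0.5, 0.2], classes)
boundary = classify_with_uncertainty([1.1, 1.0, 0.2], classes)
```

Both hypothetical objects are assigned to coral, but the boundary object carries a much higher uncertainty because two classes have nearly equal probabilities.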
Finally, using the semi-automated approach presented in this study decreased the amount of time needed to create habitat maps by approximately half. For example, the map of Vanua Vatu, an area of 40 square kilometers, was produced in about three days using the semi-automated approach, in comparison to about six days if produced entirely using manual contextual editing.

Discussion
Although segmentation models, such as those offered by eCognition, are well developed, attributing habitat classes to the image objects that result from segmentation remains challenging to automate for coral reef environments, yet is prohibitively time consuming if performed manually. In the face of the now commonly termed "coral reef crisis", though, regional-scale high-resolution habitat maps have never been in greater demand, particularly since they form the basis for the process of marine spatial planning, a strategy which has found particular favor for coral conservation [88–93].
Against this backdrop, this study has shown that although some manual editing of key classes might still be desirable, application of a multinomial logistic model to assign the image objects created by eCognition to habitats can rapidly yield accurate reef maps. Furthermore, because it is computationally efficient, this automated strategy, unlike manual attribution of habitats, is limited neither by the spatial resolution of the pixels nor by the extent of the imagery. These factors combine to make the model accessible to appropriately trained reef managers, even those in developing regions and even those equipped with only a moderate-specification laptop computer.
Although the technique performed well overall, there were still issues with misclassifications. Most of these errors occurred at the interface between habitat types. This could, in part, have been caused by objects that were not cleanly segmented by eCognition and therefore consisted of mixed habitat types (the object version of a mixed pixel), or by objects that actually contained mixed pixels, meaning pixels that themselves covered multiple habitat types. That is, the failure was due to the segmentation model as opposed to the attribution of the segments using the multinomial logistic model. In either case, object statistics (such as spectral, textural, size, and shape measures) for objects that contain mixed habitat types would have been skewed towards the class contributing the majority share of the segment. This presented a challenge to the classification model, as the model may have calculated reasonable probabilities for multiple different habitat types.
In nature, habitats grade from one to another and therefore do not always adhere to the crisp boundaries that maps used for marine spatial planning demand [94]. These gradations in the satellite imagery result from spatial and temporal heterogeneity in ecology that operates on scales finer than the pixel size of even the most capable satellite instruments (such as WV2). For simplicity, heterogeneity is typically characterized using a patch-mosaic model, where a landscape or seascape is a collection of discrete patches of habitat. In our example, the gradients between these patches were a main source of uncertainty in habitat classification; this was clearly captured in the uncertainty raster in the results.
One of the gradients that the model found difficult to resolve was the transition from back reef coral to coral rubble. This is perhaps logical considering that both classes consisted of coral detritus and, although this was consolidated in the former and fragmented in the latter, the spectral response of the two habitats was inseparable in the comparatively broad spectral bands of WV2, as compared, for instance, to a hyperspectral sensor [20,28,95–98]. A second gradient that the model found difficult to resolve in certain locations was the differentiation between macroalgae on sediment and coral framework. Again, such confusion is to be anticipated given the strong chlorophyll signature of both habitats in the visible wavelengths [99,100]. In addition, if a reef had suffered significant coral mortality and had been overgrown by macroalgae, this structure may look similar to adjacent patches of macroalgae on sediment, preventing the routine use of satellite imagery to detect coral bleaching [50,101–104]. Discerning these two habitat types using satellite remote sensing remains difficult, despite the fact that aerial photography and hyperspectral approaches have made advances in these areas [19,21,105,106]. Our own ground-truth data confirmed that separating algae on coral framework from framework that contained a high percentage of live coral cover was especially challenging. In this case, both structures shared the same reef-like geometry, precluding the use of characteristics such as habitat shape as a diagnostic feature.
The rapid attenuation of visible light with increasing water depth made it more difficult for the model to distinguish between habitats at greater depths. This was particularly evident in the fore reef environment when trying to distinguish deep sand channels from deep reef habitat, and also on deep lagoon floors, when distinguishing between barren lagoon floor and coral habitat, or barren lagoon floor and macroalgae. In the fore reef environment, however, this was not always registered by the model as increased uncertainty. This is likely because, on the fore reef, the algorithm only had to distinguish between two types of habitat, coral and sand, and most of the image objects on the fore reef were coral. Therefore, there was a good chance that the algorithm would "guess" the correct habitat type, especially at the outer edges of imagery detectability, where even sandy areas can appear dark like reef habitat. To this end, incorporating various combinations of band ratios was helpful for discerning habitat types in deeper waters, and these factors were typically determined to be statistically significant in the model. Band ratios provide an index of relative band intensity that enhances the spectral differences between the bands while reducing the effects of depth [107]. The five primary visible bands in the WorldView-2 satellite imagery offer a number of combinations of band ratios that can be used to help distinguish between habitat types [84].
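To make the band-ratio idea concrete, the sketch below computes every pairwise ratio among the five visible WorldView-2 bands for a single image object. The function name, the per-object mean values, and the use of simple (rather than log-transformed) ratios are assumptions made for illustration; the source does not state which specific ratios entered the final models.

```python
from itertools import combinations

# The five visible WorldView-2 bands, ordered by increasing wavelength.
VISIBLE_BANDS = ["coastal", "blue", "green", "yellow", "red"]


def band_ratios(mean_values, bands=VISIBLE_BANDS):
    """Return every pairwise ratio of per-object mean band values."""
    ratios = {}
    for numerator, denominator in combinations(bands, 2):
        key = f"{numerator}/{denominator}"
        ratios[key] = mean_values[numerator] / mean_values[denominator]
    return ratios


# Hypothetical per-object mean values (arbitrary units).
bright = {"coastal": 0.12, "blue": 0.10, "green": 0.08, "yellow": 0.05, "red": 0.02}

# The same object with every band dimmed by a common factor.
dimmed = {band: 0.5 * value for band, value in bright.items()}

ratios_bright = band_ratios(bright)
ratios_dimmed = band_ratios(dimmed)
```

Five bands give ten unordered pairs. A ratio cancels any multiplicative factor shared by both bands, so the two hypothetical objects yield identical ratios; attenuation by water varies with wavelength, however, so in practice ratios reduce rather than eliminate the depth effect.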
On land, the model also had trouble distinguishing gradients. One area where this was a particular issue was between areas that our ground-truthing had determined to be vegetated versus unvegetated. This may have been because areas that contained exposed earth often had some low-lying vegetative cover, such as low grasses, bushes, or other brush. Thus, such objects may actually have been mixed classes, consisting of some exposed soil, but not exclusively so, and were therefore understandably difficult for the model to distinguish. As such, one change in the analysis that may have improved the classification would have been to split terrestrial vegetation into two classes: tree canopy, and low grasslands or shrubs.
Similarly, the model had a very difficult time distinguishing between terrestrial vegetation that contained tree canopy, and mangroves. Previous work has successfully distinguished mangroves from tree canopies of other species, and different mangrove species from one another; however, these efforts have largely utilized hyperspectral data, which are better suited to detecting subtle spectral differences, or LiDAR, which can exploit geometric differences in canopy structure [108–112]. These efforts have involved region-merging segmentation together with maximum likelihood and nearest-neighbor classification, integrated pixel- and object-based approaches, and receiver operating characteristic curve analysis [113,114]. Spectral analysis has been used to differentiate between several species [115,116]. Specifically, using WV2 satellite data, as in this study, work has been done to characterize mangrove vegetation using semi-variograms combined with field data and visual image interpretation, looking at different object and pixel resolutions, and using support vector machines [117–119]. Due to the connectivity between mangrove habitat and the overall health of coral reefs and reef fish populations, future refinements of this model should try to incorporate elements from successful studies to improve the ability to distinguish mangrove canopy from that of other terrestrial tree species.
Finally, it is important to acknowledge that the need for an analyst to develop a training dataset to populate algorithms such as that presented in this paper, in the absence of an abundance of field data spanning the entire spatial domain, has limitations. As discussed in the methodology, the purpose of developing and using a training dataset was to reduce the observation error associated with model fit and habitat prediction. Training datasets developed using expert judgement, however, even when grounded in some quantitative field data as in this study, are subject to both error and bias. Algorithm performance can naturally be improved if both the bias and error are reduced. Although models that fit the training data closely may have low bias, this does not necessarily mean that the estimated model parameters, which are then used to predict the classifications of all image objects, are error free. Furthermore, error from using a misspecified training dataset is passed on during model prediction, and this observation error cannot be distinguished from model process error. Therefore, training datasets must be developed with great care to ensure that observation error is as close to zero as possible.
Along these same lines, the authors acknowledge that accuracy assessment typically compares results to field data, which are assumed to be 100% correct. However, due to the absence of an abundance of field data in this study, model results were compared to a test dataset defined as one third of the training dataset, which was set aside and not used to fit the models. In this case, the test dataset used for accuracy assessment was assumed to be 100% correct, which may not have been the case, as discussed in the preceding paragraph. As such, the modelled results may have shared some biases with the training dataset, which cannot be parsed out from the error matrices.
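The hold-out procedure described here can be sketched in a few lines. This is an assumption-laden illustration: the source states only that one third of the training dataset was set aside, so the random (unstratified) selection, the fixed seed, and the function and variable names below are hypothetical.

```python
import random


def split_holdout(objects, test_fraction=1 / 3, seed=0):
    """Randomly set aside a fraction of labelled objects for testing.

    Returns (fit_set, test_set); the test set is never used for fitting.
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(objects)   # copy, leaving the input untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labelled image objects: (object id, habitat label).
labelled = [(i, "coral" if i % 2 else "sediment") for i in range(30)]
fit_set, test_set = split_holdout(labelled)
```

Because both subsets are drawn from the same analyst-defined labels, any bias in those labels is present in the test set as well, which is the limitation noted above.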

Conclusions
Many coral reef cartographers face the challenge of having to map a large spatial area using limited field samples and without the ability to automate the workflow. This leads to copious amounts of time spent manually producing maps or contextually editing products on which automated routines were not very successful. Despite the limitations described in this manuscript, using a multinomial logistic model to perform an initial automated pass at classifying coral reef habitat from satellite imagery attained assignment accuracies of about 85 percent. This drastically reduced the time needed to produce the map, despite the need to still perform some contextual edits to correct misclassifications. The probabilistic nature of the model provides an estimate of uncertainty across space that can be supplied to the end-user as a supplement to the habitat map. This can only be provided using a statistically based approach and is not possible when a map is developed exclusively by manual contextual editing. Whether developing marine spatial planning scenarios or populating spatial models from reef habitat maps, an estimate of uncertainty across space is essential to understanding the impact of proposed management efforts and ecological processes.
In many cases, subtidal coral reef habitats of ecological and anthropogenic importance exist in remote locations that are financially and logistically difficult to access. Furthermore, most developing countries with coral reefs lack the resources necessary to conduct the extensive field campaigns that would make possible the collection of a comprehensively sampled ground-truth dataset. Yet these places play central roles in protecting biodiversity and sustaining local economies by providing food, attracting tourism, and offering the coastline some shelter from storms, among other things. Due to the breadth of ecosystem services supplied by coral reefs, in addition to the anthropogenic stressors they face, the development of detailed maps is paramount to their effective management. As a result, if an informed analyst familiar with the reef system being mapped can make a best effort to define a training dataset and use this to produce a first version of a habitat map, this would at least provide a foundation from which the marine spatial planning process can begin. Further refinement of the map product can always be achieved by collecting additional field data and/or by obtaining additional input from individuals who live in communities adjacent to the reef system being studied.

Figure 1. (a) Natural-color WorldView-2 satellite imagery acquired in 2012 for Vanua Vatu atoll from DigitalGlobe. The points on the figure represent tethered camera locations in green, dive survey locations in red, and the training data used to fit the models in orange. (b) Zones defined for Vanua Vatu based on infrared and visible band thresholds and contextual editing.

Forward stepwise model selection resulted in model fits for Vanua Vatu with McFadden's R-squared values of 0.62, 0.72, and 0.57 for the back reef, fore reef, and land models, respectively. Detailed tables for Vanua Vatu showing likelihood ratio test results from the stepwise model selection process, and parameter estimates for the final back reef, fore reef, and terrestrial models, are provided in the supplementary materials that accompany this article.
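For readers unfamiliar with the statistic, McFadden's R-squared compares the log-likelihood of the fitted model with that of an intercept-only (null) model: it equals one minus their ratio. The sketch below computes it from predicted class probabilities, using the fact that an intercept-only multinomial model predicts each class at its observed frequency. The function name and the four-object example are hypothetical and unrelated to the values reported above.

```python
import math
from collections import Counter


def mcfadden_r2(labels, predicted_probs):
    """McFadden's pseudo R-squared: 1 - lnL(model) / lnL(null).

    labels: observed class name for each object.
    predicted_probs: one dict per object mapping class name -> probability.
    """
    # Log-likelihood of the fitted model: log probability assigned
    # to the class that was actually observed.
    ll_model = sum(math.log(p[y]) for y, p in zip(labels, predicted_probs))

    # Log-likelihood of the intercept-only model, which predicts the
    # observed class frequencies for every object.
    n = len(labels)
    ll_null = sum(c * math.log(c / n) for c in Counter(labels).values())

    return 1.0 - ll_model / ll_null


# Hypothetical example: four objects, each given probability 0.9
# for its true class by the fitted model.
labels = ["coral", "coral", "sediment", "sediment"]
probs = [
    {"coral": 0.9, "sediment": 0.1},
    {"coral": 0.9, "sediment": 0.1},
    {"coral": 0.1, "sediment": 0.9},
    {"coral": 0.1, "sediment": 0.9},
]
r2 = mcfadden_r2(labels, probs)
```

A model that does no better than the class frequencies scores zero, and values approach one as the fitted probabilities for the observed classes approach one.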

Figure 2. (a) Model-predicted habitat classifications for objects in Vanua Vatu, Fiji. Note that areas classified as deep ocean or reef crest were not predicted and were assigned during zone assignment as described in Figure 1. In addition, note that object classifications assigned based on field observations or classified because they were part of the training dataset are included in the figure. (b) Habitat classifications for Vanua Vatu, Fiji using a manual, contextual editing approach. The legend showing which habitat type each color in the figure represents can be found in Table 1.

Figure 3. (a) Model-predicted habitat classifications for objects in a subsection of Matuka, Fiji. Note that areas classified as deep ocean or reef crest were not predicted and were assigned during zone assignment as described in Figure 1. In addition, note that object classifications assigned based on field observations or classified because they were assigned as sample locations are included in the figure. (b) Habitat classifications for a subsection of Matuka, Fiji determined using a manual, contextual editing approach. The legend showing which habitat type each color in the figure represents can be found in Table 1. Note that for this location, the model overestimated the classification of lagoon coral (red) and underestimated the classification of seagrass (light green). Similarly, the model had difficulty classifying fore reef sediment (light yellow).

Figure 4. (a) Model-predicted habitat classifications for objects in Tuvuca, Fiji. Note that areas classified as deep ocean or reef crest were not predicted and were assigned during zone assignment as described in Figure 1. In addition, note that object classifications assigned based on field observations or classified because they were assigned as sample locations are included in the figure. (b) Habitat classifications for Tuvuca, Fiji determined using a manual, contextual editing approach. The legend showing which habitat type each color in the figure represents can be found in Table 1. Note that in this application the model was able to classify fore reef sediment (light yellow) better than at other locations.

Figure 5. (a) Model-predicted habitat classifications for objects in Fulaga, Fiji. Note that areas classified as deep ocean or reef crest were not predicted and were assigned during zone assignment as described in Figure 1. In addition, note that object classifications assigned based on field observations or classified because they were assigned as sample locations are included in the figure. (b) Habitat classifications for Fulaga, Fiji determined using a manual, contextual editing approach. The legend showing which habitat type each color in the figure represents can be found in Table 1. Note that the model had difficulty classifying the macroalgae on the lagoon floor (purple), but was able to pick out many small reef structures inside the lagoon (so-called "bommies") (red).

Figure 6. Model classification uncertainty for Vanua Vatu, Fiji, represented as the probability that an object was classified incorrectly. Darker colors represent objects with higher classification uncertainty.

Table 1. Habitat name, description, and legend key for figures in the manuscript.

Table 2. Error matrix comparing model-predicted habitats (rows) for test data image objects (one third of the training data, set aside and not used for model fitting) with the test data habitat assignments made using ground-truth data for Vanua Vatu. Values represent numbers of image objects.

Table 3. User and producer accuracies calculated for Vanua Vatu from the error matrix in Table 2.
