Comparison of Two Independent Mapping Exercises in the Primeiras and Segundas Archipelago, Mozambique

Production of coral reef habitat maps from high spatial resolution multispectral imagery is common practice and benefits from standardized accuracy assessment methods and many informative studies on the merits of different processing algorithms. However, few studies consider the full production workflow, including factors such as operator influence, visual interpretation and a-priori knowledge. An end-user might justifiably ask: given the same imagery and field data, how consistent would two independent production efforts be? This paper is a post-study analysis of a project in which two teams of researchers independently produced maps of six coral reef systems of the archipelago of the Primeiras and Segundas Environmental Protected Area (PSEPA), Mozambique. Both teams used the same imagery and field data, but applied different approaches (pixel based vs. object based image analysis) and used independently developed classification schemes. The results offer a unique perspective on the map production process. Both efforts resulted in similar merged-class accuracies, averaging 63% and 64%, but the maps were distinct in terms of scale of spatial patterns, classification disparities, and in other aspects where the mapping process is reliant on visual interpretation. Despite the difficulty in aligning the classification schemes, clear patterns of correspondence and discrepancy were identified. The maps were consistent with respect to geomorphological level mapping (17 out of 30 paired comparisons at more than 75% agreement), and also agreed in the extent of coral containing areas to within 16% across the archipelago. However, more detailed benthic habitat level classes were inconsistent. Mapping of deep benthic cover was the most subjective result and dependent on operator visual interpretation, yet it was one of the results of highest interest for the PSEPA management, since it revealed a continuity of benthos between the islands and the impression of a proto-barrier reef.


Introduction
Coral reef mapping from remotely sensed images is now a well-established practice, as attested by the many papers published since the early 1990s [1], books [2,3], and the availability of large satellite imagery archives and commercial image service providers. Remote sensing methods have superseded visual interpretation and manual delineation of aerial photography [4][5][6], and with the development of high spatial resolution multispectral satellite sensors (pixels < 5 m) satellite data now approach the resolution of aerial photography. The digital format of satellite data facilitates computer analysis, and a variety of algorithms for producing habitat maps have been described, from basic per-pixel classification [7][8][9][10] to object based image analysis (OBIA) [11][12][13][14]. However, regardless of the mapping algorithm used, there are many aspects of map production in which operator decisions still affect the outcome: (1) choice of algorithm and parameterization; (2) choice of classification scheme; (3) visual quality checking and reprocessing; and (4) final corrections of misclassifications, i.e., contextual editing [15]. In particular, in the map production process the first analysis is rarely accepted; typically a cycle of visual assessment, adjustment and reprocessing occurs. When a map is produced for an end-user, manual correction of "obvious" misclassifications is obligatory, and the mapping process effectively becomes based on visual interpretation. While the literature contains many comparisons of different classification methods [7,8,16,17], invariably these are conducted by the same individuals or in a working-group interaction, where operator subjectivities are minimized because the aim is to assess the algorithms. However, this does not address the end-user's question: in general, how reliable and consistent is map production?
Operator subjectivity in mapping quality is rarely discussed in the remote sensing literature. Andréfouët [18] addresses the complexity of evaluating the human influence in the map production process, particularly that of those usually referred to as "experts". One reason for the lack of information on operator subjectivity is that an analysis requires two independent mapping exercises on the same site, ideally from the same imagery and with the same field data, and this is rarely feasible. In this paper we have taken advantage of such a situation, which arose from a cooperation between the European Space Agency's (ESA) G-ECO-MON project to evaluate the use of remote sensing for ecosystem services, the World Wide Fund for Nature (WWF), and Lund University, Sweden. The resulting products were two sets of benthic habitat maps for six of the islands in the Primeiras and Segundas Environmental Protected Area (PSEPA) in Mozambique: one produced by a pixel based classification methodology, the other following an OBIA approach. The satellite imagery and field data were the same for both analyses and the habitat maps showed similar accuracies according to their respective methodologies in comparison to field data. However, in addition to utilizing different algorithms, the completely independent map production comprised different classification schemes, different interpretation of the field data, and different visual quality assurance processes and contextual editing. The results have not gone through any post-comparison revision. These maps are therefore an example of the possible variation from two specific providers on the same mapping task, and give an indication of both the differences and consistencies between two wholly independent production chains.

Study Area
The Primeiras and Segundas Environmental Protected Area (PSEPA) is located in northern Mozambique, extending over 200 km of coastline, from Pebane to Angoche. The area was declared protected in late 2012, and includes mangroves, seagrass beds and diverse coral reef habitats that support a biodiversity rich ecosystem [19]. Of the two archipelagos, distributed parallel to the coastline, only the northern, Segundas, was included in this work. The archipelago consists of seven islands, but the southern-most, Moma, was excluded from the study as field data were not collected (Figure 1).
The islands are quite small, with a maximum length of about 1 km. The smaller islands support little or no vegetation while the larger islands have some forested area. Each island is surrounded by fringing reefs in a semi-circle shape to the southeast, where massive coral colonies occur sporadically [20]. The lagoons, made of sand, coral rubble and seagrass beds, are shallow and some parts are practically exposed during low tide [20,21]. The islands have relatively exposed northern, eastern and southern sides, the latter being usually subject to monsoon influenced trade winds heading northeast during the summer or wet season (October to March) and southwest in winter or dry season (April to September) [22,23]. In general, the most developed and species diverse reefs have been reported in the most sheltered regions of the coral reef systems, i.e., facing the mainland [20]. The local waters are turbulent, not only due to upwelling but also to predominant tidal waves and irregular sea floor bathymetry, resulting in strong and quite variable currents. The main direction of the offshore surface currents is southeast, while currents at depth (100-150 meters) move towards the north [23].
Remote Sens. 2016, 8, 52

Satellite Imagery Data
A mixture of 4-band WorldView-2 (red, blue, green, and near-IR) and QuickBird archive imagery was used for mapping. The scenes, one per island, were selected according to the best available visibility, defined by reduced extent of whitecaps and sun glint, visible deep benthic features, and generally clear waters (Table 1). Due to frequent rough seas and terrigenous plumes, clear imagery occurs infrequently, and the biggest time discrepancy between imagery and the in-situ data collection was four years. Although it is likely that the benthos has undergone some changes during this period, the structural components of the system, i.e., coral and rock, should be fairly consistent. The imagery underwent standard radiometric and sensor correction using calibration coefficients from the provider [24].

Benthic Cover Data
In-situ point data covering the six coral reef systems were collected by the Lund University group in two surveys, one from 15 to 17 April 2014 on the islands Mafamede, PugaPuga, Baixo Miguel and Njovo, and the second on 10 and 11 May 2014 on the islands Caldeira and Baixo Santo Antonio. The methods by which field data can be collected in this region are affected by tidal and water clarity variation as well as weather conditions. Benthic cover was observed from the boat using a clear bottom bucket: a qualitative description of the substrate and visible features was recorded, in terms very similar to the resulting OBIA level 3 classes. Underwater photographs were taken at selected locations to illustrate different benthic cover types. Geographical coordinates were captured with a Garmin Montana 650t GPS (horizontal accuracy ±3.65 m) [25]. Data points were taken from a slow moving boat at intervals of about 80-150 meters (as defined by the GPS receiver), but also according to observable changes of the benthic cover. Although initially planned as transects, the routes had to be adjusted to the tidal and geomorphologic characteristics of the islands for safety reasons. Data were collected by circumnavigating the reef crest, defining some transects in the lagoon and by visiting zones of interest previously defined according to the satellite imagery. The point count was intended to be approximately 200 points per island. However, due to bad weather and unsafe navigable conditions, the final dataset varied from 24 to 139 points per island, totaling about 660 geolocated points documenting benthic cover (Table 2). A more detailed description of the sampling methodology is given in Teixeira et al. [26].

Mapping Methodologies
The two map production efforts consisted of a per-pixel classification approach conducted under the G-ECO-MON project, henceforth denoted GEM, and an OBIA conducted by the Lund University group, denoted Lund. The results of both approaches included raster classification maps with per-pixel correspondence to the source images, after the conversion of the Lund polygon data. As such, the results were compared per-pixel. The classification schemes used were different, as were the methods for accuracy assessment with field data, detailed in the following sections.

GEM Production Effort Processing and Classification
First, the images for Njovo and Baixo Santo Antonio were corrected for water surface sun-glint using the near infra-red (NIR) band [27], while the other images were considered sufficiently free of glint that correction was not required. The purpose of the glint correction is to remove wave surface patterns that would otherwise dominate the classification, giving spectral reflectance values closer to those that would be observed if the glint were absent, and therefore more in line with those of the other images. However, the correction is not perfect and residual noise may remain [28]. Images were then subject to a spatial filtering step in which adjacent pixels of similar reflectance were replaced by their mean spectra; this was applied iteratively until the number of distinct pixels was reduced to 10%. The spatial scale of the merging was very small (typically 3 × 3 pixels) and this step was applied to reduce single pixel scale noise in the classification. Following the recommended practice of initially classifying to at least twice the number of required classes, a 60-class unsupervised k-means classification was applied. The resultant codes were ascribed to a reduced classification scheme (Table 3) by visual interpretation using historical, spatially approximate Rapid Assessment documents and generalized field data. The classification was then validated using the actual contemporary geolocated field data. There were insufficient field data points to partition the data into training and validation sets, so all of the field data were reserved for accuracy assessment. Use of visual interpretation and local knowledge for training data is not only sometimes a practical necessity [29] or inherently useful [30], it is also inevitable: no map producer would discount what they know when quality checking their classification results. Hence, operator a-priori knowledge is always relevant to map production and rarely independent from the data used for accuracy assessment. With this in mind it is important to note what background information informed the class attributions: no member of the GEM group had visited the site, but several Rapid Assessment documents from 1997 to 2010 were available that described the benthic composition in roughly defined areas [21,31,32]; the in-situ data later used for accuracy assessment were also available, but were referenced only in rough geographical terms, i.e., to understand the nature of the benthic composition inside versus outside the lagoons and to define the classification scheme.
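The classify-then-merge step described above can be sketched as follows. This is a minimal illustration only, not the GEM implementation: the pixel values, the four-cluster setting (the study used 60 classes) and the brightness threshold that stands in for the operator's visual interpretation are all assumptions made for the example.

```python
import random

def kmeans(pixels, k, iterations=20, seed=0):
    """Plain k-means on spectral vectors; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment: each pixel joins the nearest centroid (squared distance).
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update: each centroid becomes the mean spectrum of its members.
        for c in range(k):
            members = [pixels[i] for i in range(len(pixels)) if labels[i] == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centroids

def merge_clusters(labels, lookup):
    """Ascribe each unsupervised cluster code to a habitat class."""
    return [lookup[label] for label in labels]

# Hypothetical three-band reflectances: six bright (sand-like) and six dark pixels.
bright = [[0.30 + 0.004 * i, 0.35 + 0.004 * i, 0.30 + 0.004 * i] for i in range(6)]
dark = [[0.05 + 0.004 * i, 0.10 + 0.004 * i, 0.04 + 0.004 * i] for i in range(6)]
pixels = bright + dark

labels, centroids = kmeans(pixels, k=4)
# Stand-in for visual interpretation: bright clusters are called "Sand".
lookup = {
    c: ("Sand" if sum(centroids[c]) / 3 > 0.2 else "Dense vegetation")
    for c in range(4)
}
habitat = merge_clusters(labels, lookup)
```

Classifying to more clusters than required and merging afterwards lets several spectrally distinct clusters (for example the same substrate at different depths) be assigned to one habitat class.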

No. | Class | Description
5 | Sand with thin vegetation | Sand or rubble with some seagrass and/or macroalgae cover but not sufficient to obscure the visibility of the sand substrate.
6 | Dense vegetation | Dense cover of seagrass or macroalgae sufficient to obscure the substrate below.
7 | Reef flat/coral and rubble field | Habitat that contains coral in a relatively sheltered environment, typically behind the reef crest and at the edge of the lagoon. Can also be spur and groove zone.
8 | Reef crest/high coral cover | Places where coral is found in highest density, typically at the edge of the reef and top of the reef slope.
9 | Reef slope/fore reef | High coral cover region that slopes up to the reef crest from outside the lagoon, may contain soft corals. Spur and groove formations also found here.
10 | Deep benthic cover | Dark benthos barely visible in imagery, at depths 10 to 20 m, the exact nature of which cannot be determined. Could be vegetation (seagrasses or macroalgae) or deep reef structures.
11 | Waves or clouds in the image | Regions where the benthos cannot be classified because breaking waves or clouds obscure it in the source image.
12 | Sand on rock substrate | Corresponds to areas classed as sand in the in-situ data but they are typically raised, probably on consolidated rubble, and with thin vegetation.

No depth correction or depth-invariant index calculation [33,34] was applied. Given that the sites were in large part a relatively flat shallow lagoon, with the only depth variation on the fore reef slope, it was judged that merging of the 60 classes would be adequate to handle the existence of different classes due to depth. Finally, contextual editing was used to correct misclassifications as judged by visual interpretation.
The classes used were chosen by considering the structure of the field data, and the delineations that arose naturally from unsupervised classification of the imagery. The classes were defined at habitat level and include both biotic composition and geomorphological zone. In multispectral imagery of four bands (red, green, blue, and near infra-red), only two or three convey subsurface information, so it is generally not possible to differentiate between seagrass and macroalgae by spectral reflectance alone; in fact, the field data indicated that seagrass and macroalgae often occurred together. Therefore, the classification scheme included classes based on general "vegetation" which could be either seagrass or macroalgae. Typically a-priori contextual knowledge by users is required to identify areas which are dominated by seagrass. Most habitats were on a continuum where classes "sand", "sand with thin vegetation" and "dense vegetation" were not precisely definable in terms of percentage cover but arose due to spectral gradation of image pixels. No specific level of coral cover is implied in the classes "reef slope", "reef crest" and "reef flat", but these can be considered places where live coral would be found.

GEM Production Effort Accuracy Assessment
Due to the difficulty in clearly delineating classes where mixed assemblages are common (both in the imagery classification and in the in-situ data), accuracy assessment was conducted by merging classes into three basic cover types: dominated by coral, sand and vegetation (seagrass or algae). This was also necessary because the in-situ data contained only a small number of instances of some classes, and even with this merging, the "vegetation" class was absent in the in-situ data at three of the six islands. Confusion matrices for these three classes (and including "deep water" for the map data) were constructed for the six islands.
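The merged-class accuracy assessment described above can be sketched as below. The field points, the map classes at those points and the lookup that assigns detailed classes to merged cover types are hypothetical values chosen for illustration; they are not the study data, and the actual merging rules are not given in this section.

```python
from collections import Counter

# Hypothetical lookup from detailed map classes to the merged cover types.
MERGE = {
    "Reef crest/high coral cover": "coral",
    "Reef slope/fore reef": "coral",
    "Sand": "sand",
    "Dense vegetation": "vegetation",
    "Deep water": "deep water",
}

def confusion_matrix(field, mapped, field_classes, map_classes):
    """Rows are field (reference) classes, columns are map classes."""
    counts = Counter(zip(field, mapped))
    return [[counts[(f, m)] for m in map_classes] for f in field_classes]

def overall_accuracy(field, mapped):
    """Share of field points whose merged map class matches the field class."""
    return sum(f == m for f, m in zip(field, mapped)) / len(field)

# Hypothetical field observations and the map class found at each point.
field = ["coral", "coral", "sand", "sand", "vegetation", "coral", "sand", "sand"]
map_detailed = [
    "Reef crest/high coral cover", "Sand", "Sand", "Sand",
    "Dense vegetation", "Deep water", "Sand", "Reef slope/fore reef",
]
mapped = [MERGE[c] for c in map_detailed]

field_classes = ["coral", "sand", "vegetation"]
map_classes = ["coral", "sand", "vegetation", "deep water"]  # extra map-only column
matrix = confusion_matrix(field, mapped, field_classes, map_classes)
accuracy = overall_accuracy(field, mapped)
```

For these eight hypothetical points five agree, giving an overall accuracy of 0.625. The "deep water" column appears only on the map side, mirroring the asymmetry noted in the text.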

Lund Production Effort Processing and Classification
The first pre-processing step applied in the OBIA was to maximize the visual contrast of the images by a combination of radiometric correction, principal component analysis, dark object subtraction, sun glint correction according to Hedley et al. [27] and water column correction according to Lyzenga [33,34].
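Two of the corrections named above can be sketched in their commonly used general forms: an NIR-regression glint correction and a two-band depth-invariant index. This is an illustrative sketch under stated assumptions, not the Lund processing chain: the sample values, the deep-water signals and the attenuation ratio are hypothetical, and the exact parameterization used in the study is not given here.

```python
import math

def regression_slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def deglint(band, nir, deep_band, deep_nir):
    """Subtract the NIR-predicted glint: R' = R - b * (NIR - min NIR).

    The slope b is fitted on a deep-water sample, where any variation in
    the visible band is assumed to be caused by glint alone.
    """
    slope = regression_slope(deep_nir, deep_band)
    nir_floor = min(deep_nir)
    return [r - slope * (n - nir_floor) for r, n in zip(band, nir)]

def depth_invariant_index(band_i, band_j, deep_i, deep_j, k_ratio):
    """ln(L_i - L_si) - (k_i / k_j) * ln(L_j - L_sj) for each pixel."""
    return [
        math.log(li - deep_i) - k_ratio * math.log(lj - deep_j)
        for li, lj in zip(band_i, band_j)
    ]

# Hypothetical deep-water sample: glint adds 2 units of visible signal per unit NIR.
deep_nir = [0.01, 0.02, 0.03, 0.04]
deep_vis = [0.05, 0.07, 0.09, 0.11]
corrected = deglint(deep_vis, deep_nir, deep_vis, deep_nir)

# Hypothetical pixels of one substrate at 1 m and 5 m depth, seen in two bands.
k_i, k_j = 0.1, 0.2  # assumed attenuation coefficients
band_i = [0.02 + 0.3 * math.exp(-2 * k_i * z) for z in (1.0, 5.0)]
band_j = [0.01 + 0.4 * math.exp(-2 * k_j * z) for z in (1.0, 5.0)]
index = depth_invariant_index(band_i, band_j, 0.02, 0.01, k_i / k_j)
```

In this synthetic case the deglinted sample collapses to a constant value and the index is the same at both depths, which is the behavior each correction is designed to produce. In practice the attenuation ratio is usually estimated from pixels of a uniform substrate at varying depth rather than supplied directly.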
The imagery was iteratively segmented and classified according to a three level hierarchical classification scheme based on shallow water coral reef environment research by Mumby and Harborne [35], Rohmann [36] and Andréfouët [37]. The classification scheme (Table 4) was adapted to the field data and imagery so as to maximize the variety of benthic habitats included. Segmentation was performed using the multiresolution segmentation algorithm of Trimble eCognition Developer, applying a decreasing scale parameter (100 to 5), while compactness and color were kept almost constant (0.7 to 0.9). Due to the variation within the images, the parameters were adjusted to the individual datasets. Classification was performed using the Nearest Neighbor classification algorithm, including Feature Optimization tools. Training datasets were created by the operator based on image interpretation informed by the field data, together with empirical knowledge of the coral reef systems. The Lund team had physically visited the site and collected the in-situ data, but in an attempt to promote independence from the in-situ data sets, the training sites were selected from areas not covered by the field data.
Maps were created for each level of the hierarchical classification, resulting in three maps per reef system. Finally, contextual editing was applied to correct misclassifications according to visual interpretation. Further information on the image processing can be found in Teixeira et al. [26].

Lund Production Effort Accuracy Assessment
Confusion matrices were generated for the resulting bottom cover and benthic habitat maps (levels 2 and 3) using the field data points. All classes present in the map were included in the accuracy assessment, despite insufficient distribution of point data across classes.

Mapping Results Comparison
Since there was no previous intention of conducting a comparative study, the classification schemes were developed independently, which led to relatively few classes having a direct 1:1 correspondence between the two map production efforts and class grouping being required (Table 5).
For geomorphological categories, level 1 in the Lund maps, identifying corresponding classes or groupings was fairly straightforward for "Land", "Shallow surround" (shallow areas outside the fore reef), "Reef crest", "Fore reef" and "Deep water". Likewise, the bottom cover (level 2) maps were used in the assessment of "Deep benthic cover" (dark benthos of unknown composition in deep water), "Coral containing" areas (places where live coral would be found), and three different comparisons for bare sand or rock, "Bare 1-3". The benthic habitat classes at level 3 were considered too specific to provide a relevant assessment of direct map correspondence. Therefore, it was not possible to compare mixed classes with vegetated cover directly, so all classes that included seagrass and/or brown macroalgae were aggregated and compared to the GEM class "Dense vegetation". A comparison between GEM "Dense vegetation" and Lund "Sand" was also included because it was apparent from visual interpretation that this could have a high correspondence.
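The class grouping described above amounts to comparing a set of classes from one map with a set of classes from the other, pixel by pixel. A minimal sketch follows; the Lund class names ("Seagrass", "Brown macroalgae", "Coral") and the pixel labels are hypothetical, and reporting the overlap as a share of each map's group is an assumption, since the exact agreement formula is not stated in this section.

```python
def group_mask(labels, group):
    """True where a pixel's class belongs to the aggregated group."""
    return [label in group for label in labels]

def pair_agreement(gem_labels, lund_labels, gem_group, lund_group):
    """Pixel counts for a class pair and the overlap as a share of each map."""
    gem = group_mask(gem_labels, gem_group)
    lund = group_mask(lund_labels, lund_group)
    both = sum(g and l for g, l in zip(gem, lund))
    gem_n, lund_n = sum(gem), sum(lund)
    return {
        "gem_pixels": gem_n,
        "lund_pixels": lund_n,
        "both": both,
        "share_of_gem": both / gem_n if gem_n else 0.0,
        "share_of_lund": both / lund_n if lund_n else 0.0,
    }

# Hypothetical co-registered pixel labels from the two maps.
gem_map = ["Dense vegetation", "Dense vegetation", "Sand",
           "Dense vegetation", "Sand", "Reef crest/high coral cover"]
lund_map = ["Seagrass", "Sand", "Sand", "Brown macroalgae", "Seagrass", "Coral"]

result = pair_agreement(
    gem_map, lund_map,
    gem_group={"Dense vegetation"},
    lund_group={"Seagrass", "Brown macroalgae"},  # aggregated vegetated classes
)
```

Keeping the two per-map pixel counts alongside the overlap makes it possible to see whether a low agreement comes from a difference in mapped extent or from a spatial offset between equally sized areas.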
The maps from the two production efforts were initially compared by visual assessment, supported by the calculation of the edge similarity index for the pairs of classes referring to coral, vegetation and deep benthic cover. This index, ranging from 0 to 1, represents the proportion of overlap of class boundaries, thus assessing geometric correspondence between the map products. As recommended by Lizarazo [38], the edges were considered as overlapping within a tolerance zone, here defined as a buffer of approximately two times the sensor ground resolution, i.e., a total of four meters for WV-2 and five meters for QB-2 images. Additionally, the maps of each island were overlaid to develop confusion matrices of class correspondence between the GEM and Lund products. This led to a total of 18 confusion matrices quantifying the spatial agreement of the selected classes. To understand the consistency of the mapping results across the six islands, the Wilcoxon Signed-Ranks test and linear intercorrelation were applied to the number of pixels coinciding for each selected class pair. Wilcoxon's p-value for paired samples indicates the probability that results at least as extreme as those observed would occur if the null hypothesis were true, i.e., if the median of the differences of the samples were zero. Thus, a small p-value shows that there is a systematic over- or underestimation over the six islands, whereas a non-significant result means either that the agreement between the two production methods was very good or that the production methods disagreed but in a non-systematic way. The coefficient of determination of the linear intercorrelation provides an evaluation of how reliable the quantification resulting from one methodology is as a predictor of the quantification of the other. In both production efforts each island was mapped independently, so evaluating the results across the six islands will reveal any systematic biases in the production methods, encompassing both mapping algorithm and operator dependent factors.
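The two statistics described above can be sketched for a single class pair as follows. The pixel counts are hypothetical, and the exact enumeration used for the p-value is one standard way to evaluate the signed-rank test for a small number of pairs; the software and test variant used in the study are not stated here.

```python
from itertools import product

def signed_ranks(x, y):
    """Paired differences (zeros dropped) and the ranks of their absolute values."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[start]])):
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1  # ties share the mean rank
        start = end + 1
    return diffs, ranks

def wilcoxon_w(x, y):
    """W = min(sum of positive ranks, sum of negative ranks)."""
    diffs, ranks = signed_ranks(x, y)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)

def exact_p_value(x, y):
    """Two-sided exact p: share of sign patterns giving a W no larger than observed."""
    _, ranks = signed_ranks(x, y)
    observed = wilcoxon_w(x, y)
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w_plus = sum(r for r, s in zip(ranks, signs) if s)
        hits += min(w_plus, total - w_plus) <= observed
    return hits / 2 ** len(ranks)

def r_squared(x, y):
    """Coefficient of determination of the simple linear fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical pixel counts for one class pair on six islands.
gem = [100, 220, 310, 400, 520, 610]
lund = [130, 260, 360, 470, 600, 700]
w = wilcoxon_w(gem, lund)
p = exact_p_value(gem, lund)
r2 = r_squared(gem, lund)
```

In this hypothetical case one map gives the larger count on every island, so W is 0 (a systematic difference), while the counts remain almost proportional and the coefficient of determination is high. The two measures therefore describe different things: direction of bias and predictability of one extent from the other.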
The applicability of the maps as natural resources management tools was assessed by evaluating the consistency of the maps in answering hypothetical questions based on information needs from the PSEPA's management: (1) What is the extent of coral containing habitats per island, and the total within the protected area? (2) What is the extent of macroalgae or seagrass areas per island, and the total within the protected area? (3) Where are areas of deep benthic cover to be found, possibly currently unknown and a future focus for field surveys?
The values of the extents of coral, vegetation and deep benthic cover were estimated according to the class groups already presented, although this grouping could be different depending on the interpretation of the end user, particularly for the study of vegetation.

Accuracy Assessment, Edge Similarity and Class Agreement Results
Accuracy assessment with respect to the field data indicated the maps have variable but generally reasonable accuracy: 56% to 72% for GEM and 43% to 93% for Lund (Table 6). These values are considered adequate for management and planning purposes, for which a value of about 60% is generally recommended [3,39]. Moreover, the obtained results fall within the range of values found in current research on coral reef habitat mapping [7,12,18,40]. Therefore, both production efforts resulted in maps that were generally of a defensible quality in a benthic habitat mapping context. A first, rough visual assessment of the mapping outputs (Figure 2), conducted by overlapping the three OBIA maps on the pixel based ones, allowed the identification of clear differences. While land and the overall shape of the coral reef system matched well, the deep benthic classes (deep benthic cover and deep sand) were often more extensive in the GEM maps. Another observation is that the benthic habitat (level 3) Lund maps show more clearly delineated areas, particularly within the lagoon and reef crest, while avoiding the "salt and pepper" effect, which is a frequently identified advantage of the object based approach [14,41,42].
The edge similarity results support the visual interpretation, quantifying the deep benthic cover edge correspondence as ranging from 2% to 16% (Table 7). The remaining class comparisons, more complex to assess visually due to the number of classes included, show a generally good level of agreement, with an edge similarity index on the order of 10% to 30% for coral containing classes and above 30% for vegetation in the majority of the locations.
The confusion matrices comparing the map pairs show very high (>75%) class agreement for "Land", "Fore reef" and "Deep water" in all six study locations (Table 8). Two of the comparisons related to sand or rock substrate, "Bare 1" and "Bare 2", had a very high agreement for about half of the locations, while "Bare 3" (Sand over rock substrate vs. Rock) had very high agreement at only one. The "Coral containing" comparison had very good agreement at three sites and poor agreement at only one. Additionally, from the 18 generated confusion matrices, it was possible to observe that "Sand on rock substrate", "Dense vegetation" and "Sand with thin vegetation" display a high level of agreement with the class "Lagoon" from the Lund classification scheme. This result was expected, and supports adequate spatial correspondence of classes, as these three are generally found in that geomorphological zone. The class comparisons "Shallow surround", "Bare 2" and "Deep benthic cover" show a very low agreement (<50%) for the majority of the locations. The GEM classes "Deep sand" and "Deep benthic cover" mostly coincided with the Lund class "No information", which represents locations where the sea bottom was considered not visible, including not only deep water but also cloud cover, white caps and sun glint. As the number of pixels with these above-surface features is much smaller than the number over deep water, "No information" was used as a direct equivalent of "Deep water". These results support the visual interpretation of the maps, where the majority of the deep features identified in the pixel based maps correspond to deep water in the object based maps. Identifying the deepest bottom features by visual interpretation requires using various stretches of the imagery to infer where features are, and so is to some extent subjective, being reliant on the effort invested by the operator. Nevertheless, the human eye remains more powerful than automated analysis for identifying subtle and noisy spatial patterns, so visual interpretation is a valuable tool in deep areas, even if it does not constitute a systematic analysis. In particular, the fundamental limitation in deep waters is that relatively few of the photons that reach the sensor have interacted with the bottom, and no processing algorithm can compensate for this [43]. The greater extent of the "Deep sand" and "Deep benthic cover" classes in the pixel based production (Figure 2c-f, Figure 3c-f and Figure 4c-f) is probably in large part due to the effort and interpretation of the operator.
When comparing the number of pixels mapped for the selected class pairs, it is clear that results are in general similar for class pairs at the geomorphological level (Figure 5), but less so at the benthic habitat level (Figure 6).However, there is good agreement for the comparison of Coral containing areas and Vegetation.

Systematic Biases in Production Methods
The results show low p-values (<0.001) for half of the classes, confirming that there is indeed systematic over- or underestimation of the mapped classes (Table 9), but this could arise simply from the fact that the class groupings do not quite represent the same things in the two analyses. However, the comparisons for "Shallow surround", "Reef crest", "Fore reef", "Deep benthic cover", "Bare 3" and "Coral containing" do not exhibit statistically significant systematic over- or underestimation (i.e., p > 0.05).

This result is not surprising, as most of the class groupings do not correspond exactly to each other. Given the similarity in the islands' configuration and composition, systematic over- or underestimation might be expected, rather than a combination of both. It is interesting to notice that the classes likely to represent the same extent refer to features with a very clear spectral signal (such as shallow sand), features that are easy to identify visually (such as reef crest), or fairly stable features (such as geomorphological structures and coral), although their correlation was low. On the other hand, the classes that indicated under- and/or overestimation obtained a high level of agreement, likely due to the often larger extent of the object based results for the selected pairs (Figure 5). By plotting the extents of the pairs of classes against each other, strong linear correlations (r² > 0.75) were found for the class pairs that showed a high level of agreement across the study area (Table 8), as well as null Wilcoxon's W values (Table 9). This means that half of the analyzed pairs of classes, despite having distinct areas, were identified spatially with high consistency and have proportional dimensions throughout the Primeiras islands in the PSEPA. Weak linear correlations (r² < 0.35) were found for the pairs of classes with higher values of W, meaning that although these classes were statistically likely to have similar extents, their variation did not show the same behavior or a consistent over- or underestimation.
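
The two statistics used above can be illustrated with a minimal sketch. It assumes that W is the usual Wilcoxon signed-rank statistic, taken as the smaller of the positive and negative rank sums, and that r² is the squared Pearson correlation; the exact test configuration is not stated in this section. The extents below are hypothetical values for six reef systems, not data from the study. Under this convention, a null W arises when every paired difference has the same sign, that is, when one map gives the larger extent at every site.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def wilcoxon_w(x, y):
    """Wilcoxon signed-rank statistic W = min(W+, W-).

    Zero differences are dropped and tied absolute differences
    receive the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


# Hypothetical class extents (km2) for six reef systems.
# Pair 1: one map is larger everywhere and the extents are roughly proportional.
lund_km2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gem_km2 = [0.8, 1.7, 2.5, 3.4, 4.1, 5.2]

# Pair 2: differences change sign from site to site and are not proportional.
map_a_km2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
map_b_km2 = [3.5, 1.0, 4.5, 2.0, 3.0, 5.5]

print(r_squared(lund_km2, gem_km2), wilcoxon_w(lund_km2, gem_km2))
print(r_squared(map_a_km2, map_b_km2), wilcoxon_w(map_a_km2, map_b_km2))
```

In this toy example the first pair combines a high r² with W = 0 (systematic discrepancy with proportional extents), while the second pair combines a low r² with a larger W (no systematic discrepancy, but no shared pattern of variation either).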

Consequences for Management
While by standard quality assessment methods both production efforts lead to maps of good accuracy, when directly compared only a limited number of classes, corresponding to the most straightforward types of signal (land and deep water), show high agreement. The majority of the more "complex" classes, potentially the most interesting from a management perspective, have mediocre agreement at best. Additionally, class extents showed discrepancies. This leads to questions regarding map accuracy and the validity of the metrics used to quantify it. In essence, the key question is: are the maps fit for purpose? To answer this, we posed hypothetical questions based on information needs from the PSEPA's management, and evaluated the consistency of the maps in answering them.

What Is the Extent of Coral Cover Containing Habitats per Island, and the Total within the Protected Area?
A manager would estimate a total area of almost 7 km² of coral containing habitats in the Primeiras archipelago of the PSEPA using the GEM maps and of about 8 km² using the Lund maps. These results show a relative difference of about 16%, and indicate a good level of consistency between the map production efforts. However, when looking at the coral estimations for each coral reef system, the relative differences range from 6% to 96%, averaging about 50% (Figure 7). Higher percentage discrepancies occur for smaller physical differences when areal extents are low, so the improved agreement at archipelago level may simply be a function of scale, especially since there is no systematic discrepancy in coral containing area estimation (Table 8). That is, the comparison at each coral reef system suffers from "small sample statistics". Regarding the hypothetical question from the PSEPA's management, the two production efforts could equally be used to answer it: results were highly consistent in estimating coral containing areas at the archipelago level, but not for the majority of the individual coral reef systems.
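
The scale effect on percentage discrepancies can be made concrete with a short sketch. The denominator used for the relative differences is not stated in this section, so the `reference` argument below is an assumption that exposes the three obvious choices. The rounded archipelago totals of 7 and 8 km² are taken from the text; the single-reef values are hypothetical.

```python
def relative_difference(a, b, reference="larger"):
    """Absolute difference between two areas as a percentage of a reference area.

    reference: "larger", "smaller" or "mean" (an assumed set of options;
    the source does not say which denominator was used).
    """
    diff = abs(a - b)
    if reference == "larger":
        denom = max(a, b)
    elif reference == "smaller":
        denom = min(a, b)
    elif reference == "mean":
        denom = (a + b) / 2
    else:
        raise ValueError(f"unknown reference: {reference!r}")
    return 100.0 * diff / denom


# Rounded archipelago totals quoted in the text (km2).
for ref in ("larger", "smaller", "mean"):
    print(ref, round(relative_difference(7.0, 8.0, ref), 1))

# Hypothetical small reef system: a smaller absolute gap (0.5 km2)
# gives a much larger percentage than the 1 km2 gap at archipelago level.
print(round(relative_difference(0.5, 1.0, "larger"), 1))
```

With the rounded totals the result lies between 12.5% and about 14.3% depending on the denominator, so the 16% quoted in the text presumably derives from the unrounded areas. The hypothetical reef system shows how a 0.5 km² gap becomes a 50% discrepancy when the extents themselves are small.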

What Is the Extent of Macroalgae or Seagrass Areas per Island, and the Total within the Protected Area?
Due to the nature of the site and in-situ data, macroalgae and seagrass have been grouped as "vegetation", for which quantification by either production effort produces quite similar results (Figure 7), with an average overestimation by the GEM production effort of about 28%. The relative difference in vegetated area estimation for each coral reef system ranges from 9% to 63%. At the archipelago level, the GEM maps show 2.8 km² of vegetated cover and the Lund results 2 km². This difference, less than 1 km², represents about 25% variation of the estimated area of vegetation in the six coral reef systems included in this study. As with the coral cover assessment, a manager could use either map production effort to obtain a quite consistent overview of vegetation cover in the study area, but a less consistent one for individual coral reef systems.

Where Are Areas of Deep Benthic Cover to Be Found, Possibly Currently Unknown and a Future Focus for Field Surveys?
Deep benthic cover shows higher discrepancies, and its extent was on average more than 60% larger in the GEM maps. Over the whole study area this leads to a substantial difference: 26 km² according to the GEM maps vs. 8 km² according to the Lund results. The likely reason is that the GEM production relied on ascribing 60 unsupervised classes by visual interpretation; hence visual interpretation played a significant role in mapping the extent of these features (Figures 2c,d, 3c,d and 4c,d). The reduction of accuracy with depth is an expected behavior, and when the bottom reflection measured by the satellite falls below the sensor noise level, it is no longer possible to classify accurately [43,44]. Although adequate for locating deep features, the produced maps would be unreliable for quantification and change detection of deep benthic cover. Nonetheless, detection and delimitation of these previously unknown features was of great interest to the PSEPA management. In some of the imagery outside of the processed area, these features could be seen to be virtually continuous between the coral reef systems and are highly suggestive of a proto-barrier reef. Although this interpretation corresponds to a very basic visual assessment of the remote sensing output, the information is sufficient to support future research efforts in the region. This underlines that, in the effort to develop new remote sensing algorithms, it is important not to lose sight of the value that basic visual interpretation can provide for coral reef management.

Conclusions
The work presented here sheds some light on the influence of both algorithm choice and operator subjectivity in mapping exercises. By comparing results from distinct workflows based on the same data, it has been possible to identify various aspects of both algorithm and operator influence. In some cases the results were in good agreement; in others, discrepancies arose. Specifically:
(1) The differing production efforts resulted in maps that were consistent at geomorphological level (24% of categories with greater than 75% agreement) but less so at habitat level (19% of categories with greater than 75% agreement).
(2) Perhaps contrary to expectation, systematic under- or overestimation according to mapping approach was not ubiquitous. It varied mostly with class complexity, but for six cross-comparisons there was no statistically significant over- or underestimation of areal extent, including for coral containing areas.
(3) Despite differences in the specific habitats used in the classification schemes, basic management questions on coral cover and vegetation led to consistent answers, such as the total coral containing area agreeing to within 16%. Therefore, the maps could be judged as fit for management purposes such as the monitoring of habitats containing coral, although more at the archipelago level than for individual coral reef systems.
(4) Operator influence was strongest with respect to areas of deep benthic cover. These features are best identified by visual interpretation, and automated quantitative analysis is unreliable.
(5) Comparing the maps from the different production efforts was better at revealing inconsistencies and weakly identified classes than accuracy assessment based on in-situ data.
The last point implies that independent map production efforts could in themselves be a quality control method when in-situ data are lacking. Both map production efforts faced this issue, and dealt with it differently: GEM by merging classes, Lund by not assessing some. In fact, in the same way that weather model uncertainties are identified by running model ensembles, map uncertainties could also be assessed by conducting two or more independent mapping exercises. While "expensive on map production", this does have the virtue of not being reliant on in-situ data, the collection of which may be infeasible or even more expensive.
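
As a rough illustration of using independent maps as a quality control, the sketch below compares two small class grids cell by cell. It assumes the two classification schemes have already been harmonized to a common set of labels, and it defines per-class agreement as the share of cells assigned to a class by either map that are assigned to it by both; this definition, the 75% threshold reused from the text as a flagging rule, and the toy grids are all assumptions made for illustration, not the agreement measure used in the study.

```python
from collections import Counter


def overall_agreement(map_a, map_b):
    """Percentage of cells that carry the same label in both maps."""
    pairs = [(a, b) for row_a, row_b in zip(map_a, map_b) for a, b in zip(row_a, row_b)]
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def per_class_agreement(map_a, map_b):
    """Per class: cells labelled with it in both maps / cells labelled with it in either."""
    both, either = Counter(), Counter()
    for row_a, row_b in zip(map_a, map_b):
        for a, b in zip(row_a, row_b):
            if a == b:
                both[a] += 1
                either[a] += 1
            else:
                either[a] += 1
                either[b] += 1
    return {label: 100.0 * both[label] / either[label] for label in either}


# Hypothetical harmonized maps from two independent production efforts.
map_a = [
    ["land", "sand", "sand", "deep"],
    ["sand", "coral", "coral", "deep"],
    ["sand", "coral", "veg", "deep"],
]
map_b = [
    ["land", "sand", "coral", "deep"],
    ["sand", "coral", "coral", "deep"],
    ["sand", "veg", "veg", "deep"],
]

agreement = per_class_agreement(map_a, map_b)
weak_classes = sorted(label for label, pct in agreement.items() if pct < 75.0)

print(round(overall_agreement(map_a, map_b), 1))
print(agreement)
print(weak_classes)
```

Classes that fall below the chosen threshold are candidates for targeted field survey or for merging, without requiring in-situ data to identify them.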

Figure 1. Locational map of the study area.

Figure 5. Paired class comparison across the different coral reef systems at the geomorphological level; GEM refers to the pixel based mapping products, and Lund to the OBIA mapping.

Figure 6. Paired class comparison across the different coral reef systems at the bottom cover and benthic habitat levels.

4.3.1. What Is the Extent of Coral Cover Containing Habitats per Island, and the Total within the Protected Area?

4.3.2. What Is the Extent of Macroalgae or Seagrass Areas per Island, and the Total within the Protected Area?

Figure 7. Extent of coral, vegetation and deep benthic cover according to the GEM and Lund maps.

4.3.3. Where Are Areas of Deep Benthic Cover to Be Found, Possibly Currently Unknown and a Future Focus for Field Surveys?

Table 1. Sensor, acquisition date and time and visibility assessment for the imagery covering each location.

3.2. Benthic Cover Data
In-situ point data covering the six coral reef systems were collected by the Lund University group in two surveys, one from 15 to 17 April 2014 on the islands Mafamede, PugaPuga, Baixo Miguel and Njovo, and the second on 10 and 11 May 2014 on the islands Caldeira and Baixo Santo Antonio.

Table 2. Field data distribution per location.

Table 3. Habitat classification scheme applied in the GEM mapping production effort.
1. Land: Above water at time of image acquisition.
2. Deep water: Bottom can't be seen in image.
3. Deep sand: Sand where bottom is only just visible, typically more than 10 m depth.
4. Shallow sand: Relatively clean sand cover, approximately less than 10 m depth, may contain some rubble and very thin vegetation.

Table 4. Three level hierarchical habitat classification scheme applied in the Lund mapping production effort.

Table 5. Pairing of selected classes for comparisons.

Table 6. Accuracy assessment results for the pixel based and object based maps.

Table 7. Edge similarity index values for results for selected pairs of classes.

Table 8. Class agreement matrix results for selected pairs of classes.