Understanding the underlying drivers and causal factors determining the existence and sustainability of coral reefs has been propelled by the rapid degradation of these ecosystems [1]. These issues include Crown-of-Thorns outbreaks [3], coral bleaching and mortality [4], and damage from tropical storms [6], in addition to the impacts of sedimentation, nutrient run-off, and pollution [7]. Given the large footprint and cumulative effect of perturbations caused by these stressors, understanding their net effect on coral reef communities requires system-wide analysis, and generates research and management demand for broad-scale (>10,000 km²), standardised data sets. This is especially important given rapid ocean warming and acidification, the effects of which are predicted to produce large-scale changes that are beginning to occur across the planet at regional and global scales [8].
While meta-analysis can be an efficient way to synthesise underwater assessment efforts and generalise within and throughout regions [10], variability across spatial scales, multiple observers, metrics and methodologies can pose serious challenges for broad generalisations [1]. Alternatively, integrative and multi-disciplinary approaches using extensive field observations, optical remote-sensing datasets (satellite- and aerial-derived products) and modelling tools have the potential to enable investigators to scale up field observations in order to understand processes driving change in coral reefs [6]. Irrespective of the approach used to understand reef functioning and change, there is a fundamental need for broad-scale and standardised field data to accurately record and understand reefs under transition in order to provide informed management advice in a timely manner [1].
Optical remote sensing provides broad-scale areal coverage of coral reef systems (e.g., 10,000 km²), with every pixel assessed relative to its habitat composition and with pixel size determining the level of benthic detail mapped, resulting in 100% coverage of the study area [18]. However, optical remote-sensing products do not provide sufficient detail and reliability when compared to field-based measurements [19]. Conversely, field-based observations, while critical for the calibration of remote-sensing imagery and validation of the resulting maps [17], typically cover only small areas (<1%) of the study site [21].
Over the past three decades, underwater photography and videography have become increasingly accessible and are now widely used for monitoring coral reef benthic communities. Recent advances in digital photographic technology have enabled more efficient ways of obtaining observations and collecting data on the state of coral reef ecosystems [17]. Underwater vehicles, as well as diver-acquired methods [24], have also extended the capability of capturing large volumes of photographic records. Such is the case of the approach we evaluate here, the XL Catlin Seaview Survey (CSS), a method aimed at evaluating spatial and temporal patterns of benthic community structure in coral reefs using high-resolution imagery collected along linear transects (~2 km in length) by a customised underwater diver propulsion vehicle [24]. Insofar as the necessity for field data persists, the underlying challenge is shifting from a focus on the generation of information to a focus on the capacity of new tools to decode that information into meaningful metrics, which can extend our understanding of how coral reefs are impacted by a rapidly changing environment.
The challenge of rapid and accurate analysis of large volumes of images has led to productive collaborations between marine and computer science. While automated image analysis is used extensively in satellite remote sensing [28] and plankton ecology [29], its application to coral reef systems is relatively new. Such methods, which typically rely on machine learning to map visual attributes of images to semantic classes, are enabling marine scientists to extract useful ecological data from photographic records [30] at speeds significantly faster than manual methods [24]. Furthermore, the development of photographic sensors and computer vision methods has enabled the integration of new approaches in coral reef ecology to quantify a range of other metrics relevant to the discipline (e.g., reef terrain complexity [33] and fish abundance [35]).
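To make the point-to-cover conversion concrete, the following is a minimal sketch (in Python; the function and label names are illustrative, not part of the CSS pipeline) of how classified points, whether assigned by a human annotator or an automated classifier, are typically aggregated into percent-cover estimates:

```python
from collections import Counter

def percent_cover(point_labels):
    """Estimate percent cover per benthic category from classified points.

    `point_labels` is a flat list of category labels, one per sampled
    point across the images of a transect (e.g., a fixed number of
    points per image, each assigned a label such as "hard_coral").
    """
    counts = Counter(point_labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}

# Illustrative example: 10 classified points from one image
labels = ["hard_coral"] * 3 + ["algae"] * 5 + ["sand"] * 2
print(percent_cover(labels))  # {'hard_coral': 30.0, 'algae': 50.0, 'sand': 20.0}
```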
Previous studies have revealed the potential of automated image annotation to accelerate data mining from underwater imagery and generate reliable ecological metrics for coral reefs [30]. While there is high fidelity between human and automated annotations, the latter tend to introduce a level of variability or "noise" to the benthic coverage estimations, perhaps attributable to changes in image quality over time and space (e.g., light, water clarity and distance of the camera from the substrate). This raises the question of whether methodologies such as those used in the CSS produce viable outcomes for detecting and monitoring reef-scale changes in benthic composition over time and across large spatial extents (>10,000 km²).
Here, we validate the application of automated image analysis to field imagery collected by the CSS [24], with a central aim of scaling up underwater observations of coral reefs. In particular, we explored and analysed the error introduced by automated image classification when it is used to estimate changes in benthic community composition across large spatial (regional) and temporal (years) scales. Using imagery collected at multiple sites across 49 reefs from the Great Barrier Reef (GBR) and Coral Sea Commonwealth Marine Reserve (CSCMR), we discuss the fidelity of automated and human estimations in the context of variability introduced by multiple observers, in order to understand, describe and offer potential applications and limitations of this technology.
Overall, the percentage cover estimated from automated annotations captured spatial and temporal patterns of benthic community composition across the GBR and CSCMR, but with higher quantification errors than the inter-observer variability among human annotators. Our study reveals that machine estimates can measure changes in benthic composition, over time and space, with a minimum detection threshold ranging from 2% to 12% among 19 benthic categories. The method described here thus has the capacity to gather ecological metrics at kilometre scales with consistently low errors. The generation of standardised, high-definition and spatially sound datasets, via the methods described here, presents an opportunity to fill key data gaps (e.g., stock assessment, biodiversity, temporal trends) [17] and to track and understand the functional attributes of coral reef systems across broad temporal (e.g., years to decades) and spatial (e.g., >10,000 km²) scales. This methodology is limited by a narrower taxonomic precision when compared to many smaller and more controlled photographic or in-situ surveys. These are trade-offs that need to be considered when choosing between "hands-on" and semi-automated survey technologies [22]. Advances and further improvements in image capture and analysis tools are likely to reduce this limitation over time and are discussed further herein.
4.1. Error across Space
Mean estimations of the spatial error introduced by the automated analysis varied among categories and averaged 2.5%. These errors increased with the functional aggregation of communities (e.g., diversity and dissimilarity indices). Overall, the impact of this error on the interpretation of the data will depend on the relative abundance of the organisms, the taxonomic resolution and the ecological relevance of the variability recorded. While we observed a relatively low estimation error, the noise introduced by the automated analysis may lead to misinterpretations for rare categories whose average abundance is similar to the quantification errors estimated here (2%–3%). The impact of automated analysis errors on the assessment of more dominant benthic groups, on the other hand, is less pronounced. For example, the mean abundance of hard corals in the GBR and CSCMR ranged from 21% to 31%, in accordance with other studies [50], and the error of automated estimations averaged <5%. The errors reported here fall within the expected values for these sites considering the complexity of their respective substrates.
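As an illustration of how such per-category errors can be quantified, the sketch below computes the mean absolute difference between paired human and machine percent-cover estimates; the data and function name are hypothetical, and the study's exact error formulation may differ:

```python
import numpy as np

def quantification_error(human, machine):
    """Mean absolute difference (in percent-cover points) between paired
    human and machine estimates for one benthic category.

    `human` and `machine` are arrays of percent-cover estimates for the
    same sites/transects (one value per site).
    """
    human, machine = np.asarray(human), np.asarray(machine)
    return np.mean(np.abs(human - machine))

# Hypothetical paired estimates of hard-coral cover (%) at five sites
human_cover   = [21.0, 28.5, 31.0, 24.2, 26.8]
machine_cover = [19.5, 30.0, 27.5, 26.0, 29.1]
print(quantification_error(human_cover, machine_cover))  # ~2.1
```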
Such variability in the representation of rare and dominant categories carries over into community structure metrics, where indices of diversity and evenness are sensitive to the abundance of rare categories [85]. Hence, greater variability in the automated estimations of diversity and evenness was observed, although the ability of the machine to capture spatial trends was preserved. Community structure estimation errors can be summarised by measuring the dissimilarity of the community assemblage estimated by the machine from that estimated by a human annotator, following a traditional approach in community ecology [86]. In this study, we observed that automated estimations of benthic composition differed, on average, from the reference (human estimations) by 25% (Bray-Curtis dissimilarity). However, large heterogeneity of benthic community assemblages has been found in the GBR and CSCMR [83], where dissimilarity metrics of benthic assemblages typically range from 40% to 60% [84], which concurs with the Bray-Curtis dissimilarities we observed among reference sites across the GBR and CSCMR (Figure 3C). Therefore, while the method described here provides advantages for broad-scale community assessments, restrictions may apply to fine comparisons of benthic assemblages, where automated annotation errors may obscure subtle differences among communities.
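For reference, the standard definitions of these metrics (the conventional forms from community ecology, not equations reproduced from this study) are:

$$ BC_{jk} = \frac{\sum_{i=1}^{S} \lvert x_{ij} - x_{ik} \rvert}{\sum_{i=1}^{S} \left( x_{ij} + x_{ik} \right)}, \qquad H' = -\sum_{i=1}^{S} p_i \ln p_i, \qquad J' = \frac{H'}{\ln S} $$

where $x_{ij}$ is the percent cover of category $i$ in estimate $j$, $p_i$ is the proportional cover of category $i$, and $S$ is the number of categories. $BC_{jk}$ ranges from 0 (identical assemblages) to 1 (complete dissimilarity), so the 25% figure above corresponds to $BC \approx 0.25$ between machine and human estimates of the same communities.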
4.2. Error in Detecting Change over Time
The automated annotator estimated the temporal change of benthic composition at error levels similar to the spatial error recorded (2%–12%, mean among labels). As with intra-year comparisons, the noise added by the automated annotator to estimates of temporal change in community composition may affect the interpretation of subtle changes, suggesting that this approach is more suitable for mid-range temporal scales (years or decades) than for subtle inter-seasonal fluctuations. As a reference, coral cover has decreased in the GBR by as much as 25%–30% over the past three decades [6], while less representative coral species can fluctuate by around 5% [83]. Our results suggest the detection of subtle temporal variations of coral categories (<5%) may be hindered by the noise of automated estimations. Nonetheless, this approach has the capacity to provide detailed and broad-scale information on significant temporal changes (>2%–12%, depending on the category) in coral reef benthic communities, thus providing the data required for assessing causes and consequences of accelerated coral reef degradation over large extents [1].
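In practice, these detection thresholds can serve as a simple decision rule when interpreting automated temporal comparisons. The sketch below illustrates one such rule, flagging only changes that exceed a category's noise floor; the cover values and thresholds shown are hypothetical:

```python
def significant_changes(cover_t1, cover_t2, thresholds):
    """Flag categories whose estimated change in percent cover exceeds
    the category's minimum detection threshold (the noise floor of the
    automated annotator).

    All arguments are dicts keyed by benthic category; cover values and
    thresholds are in percent-cover points.
    """
    return {
        cat: cover_t2[cat] - cover_t1[cat]
        for cat in thresholds
        if abs(cover_t2[cat] - cover_t1[cat]) > thresholds[cat]
    }

# Hypothetical surveys of the same reef in two consecutive years, with
# illustrative per-category thresholds within the 2%-12% range reported
year1 = {"hard_coral": 28.0, "soft_coral": 9.0, "algae": 34.0}
year2 = {"hard_coral": 21.5, "soft_coral": 8.2, "algae": 40.1}
thresholds = {"hard_coral": 5.0, "soft_coral": 2.0, "algae": 6.0}
print(significant_changes(year1, year2, thresholds))
# {'hard_coral': -6.5, 'algae': ~6.1}; soft_coral (-0.8) is below its threshold
```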
4.3. Sources of Error
Errors introduced by the automated estimation of percentage cover of benthic groups across the GBR and CSCMR (spatial and temporal errors) can be attributed to two methodological caveats: (a) image quality, as a result of variability in reef appearance, underwater light irradiance and distance of the camera to the substrate; and (b) complexity of the label set, where groups enclosing many species with high morphological variation and phenotypic plasticity introduce large variability due to overlapping visual features that challenge classification [30]. Since the imaging technique does not use artificial light, underwater light irradiance and reef light reflection pose imaging challenges. To compensate for this, fish-eye lenses, high-dynamic-range cameras and the flexibility to adjust the camera ISO (International Organization for Standardization) settings on the fly optimise the amount of light captured by the camera [24]. An on-board altitude sensor records the distance from the camera to the substrate, which enables the selection of only those images taken within a range of 0.5 to 2 m above the substrate, to maintain a fixed resolution for the imagery [24].
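A minimal sketch of this selection rule (the record format and field names are hypothetical, not those of the CSS data system):

```python
# Keep only images captured within the usable altitude window
# (0.5-2 m above the substrate, as described above), so that ground
# resolution stays roughly constant along the transect.
MIN_ALT_M, MAX_ALT_M = 0.5, 2.0

def select_usable_images(images):
    """`images` is an iterable of (image_path, altitude_m) pairs, with
    altitude taken from the vehicle's on-board altitude sensor."""
    return [path for path, alt in images if MIN_ALT_M <= alt <= MAX_ALT_M]

records = [("img_0001.jpg", 1.2), ("img_0002.jpg", 2.7), ("img_0003.jpg", 0.8)]
print(select_usable_images(records))  # ['img_0001.jpg', 'img_0003.jpg']
```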
Taxonomic resolution and morphological plasticity within the label set can also affect the capacity of automated and manual methods to accurately estimate composition and abundance [23]. Quantification of benthic composition from underwater images is limited by the level of taxonomic identification that can be resolved [23], as high taxonomic resolution (e.g., species level) requires quantifying the micro-scale morphology and internal structures of reef organisms. In addition, species and groups of species exhibiting large morphological variations [88] may have visual attributes or features that overlap among groups, thereby hindering the capacity of automated classification to accurately resolve these classes. Furthermore, depending on the taxonomic aims of the study, taxonomically challenging categories are more prone to human errors, which are carried across to machine estimations from training data sets [22]. Therefore, the classification reference or label set needs to be designed in such a way that it conveys the taxonomic resolution that is functionally relevant for the intended study, while acknowledging the taxonomic limitations of underwater image analysis, both manual and automated [30].
4.4. Future Directions
While we acknowledge that a higher taxonomic resolution can be achieved from underwater images [22], here we used a conservative approach. In this study, we expanded the benthic categorisation compared to a previous study [24], whilst maintaining a relatively low number of taxonomic groups categorised by their functional traits (Table 1) and ensuring minimal overlap of visual features among labels. Further research is needed to evaluate the feasibility of expanding the resolution of taxonomic identifications and the capacity of automated methods to discern among labels.
Looking forward, our results indicate that there is room for improvement, and the errors reported here can be further reduced by two orthogonal developments. The first involves recent developments in the field of automated image analysis using deep Convolutional Neural Networks (CNNs), which have dramatically increased classification accuracy for a wide range of images [89]. The second involves the opportunity to complement the RGB camera used here with multi-spectral or fluorescence cameras [91]. In this case, collecting additional spectral information could improve the ability to distinguish visually overlapping categories, thereby improving the precision and accuracy of automated classification.
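As an indication of what the first of these directions might look like, the following is a minimal PyTorch sketch of a CNN patch classifier fine-tuned for benthic categories; the architecture, hyperparameters and names are illustrative assumptions, not the implementation evaluated in this study:

```python
import torch
import torch.nn as nn
from torchvision import models  # requires torchvision >= 0.13 for the weights API

NUM_CLASSES = 19  # e.g., the 19 benthic categories used in this study

# Start from a network pre-trained on generic imagery and replace the
# final layer so it predicts benthic categories for small image patches
# cropped around each annotation point.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(patches, labels):
    """One optimisation step on a batch of point-centred patches.

    `patches`: float tensor of shape (batch, 3, 224, 224)
    `labels`:  long tensor of shape (batch,) with category indices
    """
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(patches), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Smoke test with random tensors standing in for annotated patches
loss = train_step(torch.randn(8, 3, 224, 224), torch.randint(0, NUM_CLASSES, (8,)))
```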