1. Introduction
Indonesia has abundant coastal and marine natural resources, including coral reefs. Together with several neighbouring countries in the Asia-Pacific region (the Philippines, Papua New Guinea, Timor Leste, Malaysia, and the Solomon Islands), Indonesia is a member of the Coral Triangle Initiative (CTI), an association of countries located at the center of the world’s coral reef biodiversity that was established to preserve coral reef resources. This underwater ecosystem has a highly strategic function in national development and in driving the national economy through food security (sustaining fish stocks), tourism (snorkeling and diving), and coastal protection (against erosion caused by currents and waves). Furthermore, coral reefs are essential to President Joko Widodo’s program to realize Indonesia as the World Maritime Axis (Poros Maritim Dunia, PMD). Preserving coral reefs significantly supports this ambition with regard to food security through marine fishery resources (pillar number 2) and the economy through maritime tourism (pillar number 3). Given the importance of coral reefs to the nation, information on the spatial distribution of their condition is very important.
Mapping benthic habitats, including coral reefs, with remote sensing is challenging, which has led to the development of approaches aimed at higher classification accuracy [1,2]. Per-pixel classification algorithms have commonly been used for the mapping, especially at a general level of benthic habitat complexity [3,4]. Algorithms were further enhanced by using information on object texture, shape, neighborhood, and other spatial aspects in Object-based Image Analysis (OBIA) [5,6]. Recently, non-parametric machine-learning classification algorithms such as Support Vector Machine (SVM) and Random Forest (RF) have also been adapted and have shown promising accuracy [7,8,9]. Improving accuracy is not limited to finding a suitable classification algorithm; it also involves finding the most suitable benthic habitat classification scheme for the specific remote-sensing data [10], integrating various datasets such as hyperspectral images, aerial photos, and bathymetry [7,9], combining active and passive remote-sensing systems [11], and conducting the mapping with a hierarchical classification process [5].
Recent improvements in the spatial and spectral resolution of multispectral images challenge us to produce more accurate benthic habitat maps. However, the spectral bands able to penetrate the water body are limited to the visible bands and are hindered by the low signal-to-noise ratio (SNR) of shorter-wavelength bands [8], which makes the mapping more challenging. Consequently, only benthic habitat mapping using a four-class scheme (i.e., coral reef, seagrass, macroalgae, and sand) generally achieves high accuracy [2]. The accuracy is relatively lower for mapping with a detailed scheme [12,13,14]. Furthermore, Reference [15] established that even at 0 m depth, most benthic habitats with similar pigmentation are difficult to differentiate spectrally, and that, because of water-column energy attenuation, the difficulty of spectrally separating benthic habitats increases with water depth. This occurs especially when a detailed benthic habitat classification scheme is desired. In addition, the accuracies of detailed benthic habitat mapping vary greatly with the images and methods employed, as well as with the complexity of the benthic habitat environment and the composition of each study area [2,16].
A commonly used classification algorithm for benthic habitat mapping is Maximum Likelihood (ML) [3,4,12,13,14,17]. This classifier is considered ideal if the spectral response of benthic habitats has a Gaussian distribution and the training-area samples of the benthic habitat features in the dataset are normally distributed. In reality, these assumptions are not always met, especially in complex benthic habitat environments. Machine-learning techniques such as SVM, RF, and Neural Net (NN) are better suited to such cases and have produced higher accuracy than ML [7,9,18], since they do not require these assumptions to work effectively.
Another issue in remote-sensing-based benthic habitat mapping is the need to obtain field data to train the classification algorithm each time a classification is carried out. As a consequence, benthic habitat maps for areas with limited access are scarce or even non-existent, even though these hardly accessible areas shelter a rich biodiversity of benthic habitats. Many of these areas remain unmapped mainly because field data collection increases the time required for mapping as well as the logistics costs, i.e., transportation, accommodation, surveyors, and device insurance. Therefore, it is essential to develop a benthic habitat mapping model that can be applied to different areas with relatively consistent accuracy, and this is the main purpose of this research.
We tested three machine-learning classification algorithms (RF, CTA, and SVM) to classify benthic habitats and to obtain the parameters of the most accurate map for adaptation to other areas. A machine-learning algorithm can develop parameters and a set of decision rules that can be saved and applied to other images, provided they have similar inputs. The classification was performed at pixel level because this maintains the precision of detailed benthic habitat variations, which is to some extent sacrificed in OBIA. We used two levels of classification scheme complexity to assess the performance of the model. SVM and RF were previously used to map benthic habitats by [7,8,9,18]. Therefore, by using SVM and RF, our work can be compared with these studies and placed in the context of the broader benthic habitat mapping framework. Meanwhile, we also propose the CTA algorithm in order to enrich the selection of machine-learning algorithms available for benthic habitat mapping.
This research was conducted in the Karimunjawa Islands (Figure 1). These islands shelter a high biodiversity of benthic habitats [14,19]. The substrate is dominated by carbonate sand. Rubble is also present in waters adjacent to the shoreline, and red-colored volcanic sand dominates the substrate along the shoreline of Kemujan Island. The mapping model was developed on Kemujan Island, which was selected as the representative area because of its high variation in reef ecology, morphology, and water depth [14]. Karimunjawa, Menjangan Besar, and Menjangan Kecil Islands were used to assess the applicability of the mapping model.
3. Results
3.1. Classification Results
The highest benthic habitat classification accuracy was obtained from RF, with 88.54% overall accuracy (OA) and a mean overall accuracy of 88.05 ± 0.29%. These were obtained from an RF model that used the Gini coefficient to measure node impurity and the square root of the number of features to determine the number of randomly selected features, with 100 trees. This accuracy is very high given the number of classes involved and the complexity of the classification scheme. RF outperformed the other algorithms not only in the accuracy assessment statistics but also in the spatial distribution of benthic habitats across the scene. Seagrass and macroalgae classes were classified along the shoreline. The reef flat in the southern part of the scene was classified as brown algae and sand. The lagoon in the western part of the island was also classified as healthy, intermediate, or dead coral. Sand was correctly classified in the lagoon, especially in the back-reef area. The reef crest and fore reef were mainly classified as healthy coral reef with some intermediate coral reef. The misclassification between coral reef and seagrass in the reef-crest area and at the boundary between optically shallow and optically deep water did not occur in the RF result but was noticeable in the CTA and SVM results (see red polygon in Figure 4). However, there are also areas in the northern part of Kemujan Island where seagrass was misclassified as coral reef, and only CTA produced the correct classification (see blue boxes in Figure 4).
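The two RF settings named above can be illustrated with a minimal Python sketch that computes the Gini impurity of a tree node and the square-root rule for the number of candidate features per split. The function names, the example node, and the six-band input are hypothetical and are not taken from the software used in this study.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of the class labels that reach one tree node."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # 1 minus the sum of squared class proportions; 0 means a pure node.
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def features_per_split(n_features):
    """Number of randomly selected features per split (square-root rule)."""
    return max(1, int(math.sqrt(n_features)))


# Hypothetical node holding pixels of three benthic classes.
node = ["sand"] * 6 + ["healthy coral"] * 3 + ["Th"] * 1
print(round(gini_impurity(node), 2))  # 0.54
print(features_per_split(6))          # 2 (hypothetical 6 input bands)
```

A split is preferred when it lowers the weighted impurity of the child nodes, and restricting each split to a random subset of features is what decorrelates the individual trees of the forest.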
In the RF result, misclassification occurred among healthy, intermediate, and dead coral, as they share similar spectra and class descriptors. Healthy coral was also misclassified as EaTh and sand. The brown algae class was mainly misclassified as mixed-algae and sand, which was expected given that mixed-algae also contains brown algae, while mixed-algae was misclassified as coral reef classes because similar pigmentation results in similar reflectance. Ea was misclassified as brown algae since the reflectance of epiphyte-covered Ea resembles that of brown algae [42]. EaTh was misclassified as mixed-seagrass and Th, as the spectra of these classes overlap [42]. Ho was mainly misclassified as intermediate coral, Th as brown algae and EaTh, and ThCr as Th. These misclassifications can be attributed to the association of Th or ThCr with brown algae, especially Padina sp. and Dictyota sp. The misclassification of Th as EaTh and of ThCr as Th resulted from the overlapping spectra of Th in these classes. Mixed-seagrass was understandably misclassified as ThCr, since ThCr spectra are also mixed-seagrass spectra. Furthermore, mixed-seagrass was also misclassified as sand. Sand was misclassified as brown algae, Ea, healthy coral, and intermediate coral, as sand is the dominant substrate for these classes and for almost all benthic classes in the study area. See Table 4 for the confusion matrix of the RF result.
The highest CTA accuracy, 77.8%, was obtained using DII. The accuracy of CTA using other inputs was similarly high, with a mean overall accuracy of 75.39 ± 2.17%, which was higher than the mean classification accuracy of SVM. When all inputs were used, the accuracy of CTA was 77.17%. The Ho class had very low accuracy, and rubble was entirely misclassified. Ho is rarely found in large, dense beds, and thus its resultant reflectance is still strongly affected by the sandy background. It was also found adjacent to the brown alga Padina sp., and thus was mostly misclassified as sand and brown algae. Rubble had the worst result, with zero classification accuracy. Given that rubble is commonly found between and adjacent to coral reefs of various conditions, all rubble validation samples were classified as healthy coral reef, intermediate coral reef, or sand. The class descriptors of the healthy and intermediate coral reef classes also include rubble as a minor component. Rubble is made of the same material as carbonate sand and was thus easily misclassified as sand, which had more dominant coverage than rubble. See Table 5 for the confusion matrix of the CTA result.
The best CTA results were obtained using a 1% auto-pruning threshold. In our experiments with 5%, 10%, 15%, and 20% auto-pruning thresholds, the accuracy decreased at the 5% threshold, increased at 10%, and then kept decreasing at 15% and 20%. Beyond the declining accuracy, the analysis revealed that high auto-pruning thresholds are not recommended for mapping with a complex classification scheme, because they eliminate classes whose leaf pixels do not meet the threshold criterion. For instance, at the 10% threshold, the accuracy of CTA using DII increased to 89%, but only eight benthic habitat classes remained.
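The way a high pruning threshold can remove whole classes can be sketched as follows. This is a simplified illustration only: it assumes that a leaf is dropped when it holds less than the threshold share of all training pixels, which may differ from the exact auto-pruning rule of the CTA implementation used here, and the leaf sizes are hypothetical.

```python
def surviving_classes(leaves, threshold):
    """Return the classes that still own at least one leaf after pruning.

    leaves: list of (class_label, pixel_count) pairs, one per tree leaf.
    threshold: minimum share (0 to 1) of all training pixels a leaf must hold.
    Assumption: the share is computed against the total of all leaves.
    """
    total = sum(count for _, count in leaves)
    kept = [label for label, count in leaves if count / total >= threshold]
    return sorted(set(kept))


# Hypothetical leaf sizes; rare classes occupy small leaves.
leaves = [("sand", 500), ("healthy coral", 300), ("Th", 120),
          ("Ho", 50), ("rubble", 30)]

print(surviving_classes(leaves, 0.01))  # all five classes survive
print(surviving_classes(leaves, 0.10))  # Ho and rubble are pruned away
```

Under this assumption, overall accuracy can rise after pruning simply because the rare, hard-to-classify classes are no longer in the map, which is consistent with the eight-class result reported for the 10% threshold.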
SVM was the third-best algorithm, with a highest overall accuracy of 75.98% obtained using DII input and a mean accuracy of 74.27 ± 1.04%, which is slightly lower than the CTA mean. These were obtained with the C value set to 10.00 and the g value set to 0.001; the multiplier for the C and g values during model tuning was set to 10. Despite this accuracy, SVM failed to classify the rubble class and several seagrass classes such as ThCr, CrHu, and Ho (Table 6). This contradicts the work of Reference [9], where seagrass variations could be correctly classified using SVM. The difference can be attributed to differences in the benthic habitat environmental complexity of each study area. Reference [9] classified areas covered mostly by seagrass of different conditions, which lowered the rate of misclassification of seagrass classes as other benthic habitats.
The main drawback of running the SVM algorithm with a large number of training samples is the processing time. Running an SVM classification using area samples took almost four hours per classification. We tried running the SVM algorithm on the newest computer processing hardware, but the processing time did not decrease significantly compared with older hardware. Thus, although the accuracy of SVM is similar to that of CTA, its productivity is much lower, especially when experimenting with various SVM parameter settings. The overall accuracy of each classification result is summarized in Table 7. The results of the machine-learning classifications are provided in Figure 5.
A McNemar test was performed to select the model to be applied to other islands. RF produced the highest classification accuracy, and based on the test, the performance of the RF model using all datasets was not significantly different from that of the RF model using deglint bands only. We therefore selected the RF model built from deglint bands, because it is more consistent and can be widely applied. Using DII, PC bands, and bathymetry increases the complexity of the required input and leads to non-standard input values in exchange for an insignificant OA improvement. The values of PC bands and DII depend strongly on the statistics of the input samples and of the image, which makes them prone to subjectivity. Meanwhile, bathymetry varies greatly between areas and cannot be used as a standard parameter.
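A minimal sketch of how two classification results can be compared with a McNemar test on the same validation samples is given below. The continuity-corrected chi-square form with one degree of freedom is assumed, since the variant used in this study is not stated here; the function name and the sample data are hypothetical.

```python
import math


def mcnemar_test(reference, pred_a, pred_b):
    """Continuity-corrected McNemar test for two classifiers.

    Returns (chi2, p_value). Only samples on which the two classifiers
    disagree in correctness contribute to the statistic.
    """
    triples = list(zip(reference, pred_a, pred_b))
    # b: classifier A correct, classifier B wrong
    b = sum(1 for r, a, p in triples if a == r and p != r)
    # c: classifier A wrong, classifier B correct
    c = sum(1 for r, a, p in triples if a != r and p == r)
    if b + c == 0:
        return 0.0, 1.0  # no discordant samples, no evidence of a difference
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical validation labels and two model outputs.
reference = ["coral"] * 20
model_all_inputs = ["coral"] * 15 + ["sand"] * 5
model_deglint = ["coral"] * 5 + ["sand"] * 10 + ["coral"] * 5

chi2, p = mcnemar_test(reference, model_all_inputs, model_deglint)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) would indicate that the two models do not differ significantly, which is the situation in which the simpler input set can be preferred.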
3.2. Model Application
The application of the 14-class RF benthic habitat model from Kemujan Island to Karimunjawa, Menjangan Besar, and Menjangan Kecil Islands (hereafter referred to as the test areas) was not very successful. The overall accuracy of RF in the test areas was 48.99%; healthy coral reef was the most accurate class, with a user’s accuracy (UA) of 66.24% and a producer’s accuracy (PA) of 91.99%. The other classes followed: ThCr (UA 55.13%, PA 55.4%), Ea (UA 49.6%, PA 49.87%), rubble (UA 33.33%, PA 33.59%), intermediate (UA 28.61%, PA 28.85%), and sand (UA 21.38%, PA 27.8%). The remaining benthic habitat classes had UA and PA below 10%. Nevertheless, the result was broadly consistent in that the shoreline was dominated by seagrass and brown algae and coral reef was located on the reef-crest or reef-cut. However, there were some inconsistencies: healthy coral was mapped along the shoreline, and seagrass extent was overestimated in Menjangan Besar and Menjangan Kecil Islands. These results indicate that:
The benthic habitat classification scheme is too detailed and creates confusion when the model is applied to other areas.
Since not all benthic classes may be present in all areas, it is unclear whether the model failed to classify a particular class, e.g., CrHu, or whether that class truly does not exist in the area. In our case, we can confirm that this was a failure of the model, since our field data indicated that CrHu is present in the field.
For general mapping, the scheme needs to be refined and made more universal, to ensure that all classes in the scheme are present in many areas.
To support this statement, we simplified the scheme to the major benthic habitat classes: coral reef, seagrass, macroalgae, and “sand and rubble”. All training areas were re-labelled according to these classes. We developed the model with RF using deglint bands and obtained 94.17% OA. The UAs are 98.58%, 83.13%, 66.69%, and 91.80% for the coral reef, seagrass, macroalgae, and “sand and rubble” classes, respectively. The PAs are 97.44%, 87.76%, 58.57%, and 93.74% for the classes in the same order. Most classes had very high accuracies with few misclassifications. The macroalgae class had the lowest accuracy owing to a high rate of misclassification as “sand and rubble” and seagrass.
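The re-labelling step can be sketched as a simple lookup from detailed to major classes. The mapping below is an assumption reconstructed from the class names mentioned in the text (for example, dead coral is assumed to belong to coral reef); it is not the authors’ actual lookup table, and the function name is hypothetical.

```python
# Hypothetical lookup from the 14 detailed classes to the 4 major classes.
MAJOR_CLASS = {
    "healthy coral": "coral reef",
    "intermediate coral": "coral reef",
    "dead coral": "coral reef",        # assumption: grouped with coral reef
    "brown algae": "macroalgae",
    "mixed-algae": "macroalgae",
    "Ea": "seagrass",
    "EaTh": "seagrass",
    "Th": "seagrass",
    "ThCr": "seagrass",
    "CrHu": "seagrass",
    "Ho": "seagrass",
    "mixed-seagrass": "seagrass",
    "sand": "sand and rubble",
    "rubble": "sand and rubble",
}


def relabel(detailed_labels):
    """Map detailed training labels to the simplified four-class scheme.

    Raises KeyError for a label missing from the lookup, so that an
    unexpected class is noticed instead of being silently dropped.
    """
    return [MAJOR_CLASS[label] for label in detailed_labels]


print(relabel(["Ho", "rubble", "dead coral", "mixed-algae"]))
# ['seagrass', 'sand and rubble', 'coral reef', 'macroalgae']
```

Because only the labels change, the same training pixels and input bands can be reused to train the simplified model.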
Afterwards, we applied this RF model to the test areas and obtained 70.93% OA. The UAs for the coral reef, seagrass, macroalgae, and “sand and rubble” classes are 86.24%, 44.30%, 5.94%, and 37.00%, respectively, while the PAs are 91.75%, 20.40%, 13.48%, and 29.64% for the classes in the same order. As expected, the simplified RF model produced more accurate results than the 14-class model. This accuracy is high, especially for rapid mapping, and it is within the acceptable limit for benthic habitat mapping using a scheme comprising four benthic classes. The acceptable limit lies within the range of 40%–70% [1] and >60% based on the Indonesian National Standard for Mapping [57]. Nevertheless, only the coral reef class produced an accurate result, while the other classes had low accuracies. Seagrass was frequently misclassified as macroalgae and coral reef, and macroalgae as “sand and rubble” and seagrass. The “sand and rubble” class was frequently misclassified as coral reef, which indicates that the spatial distribution of coral reef was overestimated. Many areas near the shoreline that should have been classified as sand, seagrass, or macroalgae were misclassified as coral reef.
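The OA, UA, and PA values reported in this section follow from a confusion matrix, as in the minimal sketch below. The matrix orientation (rows are mapped classes, columns are reference classes), the function name, and the two-class example counts are assumptions for illustration and are not the data of this study.

```python
def accuracy_metrics(matrix, labels):
    """Compute overall, user's, and producer's accuracy.

    matrix[i][j] is the number of validation samples mapped as labels[i]
    whose reference (field) class is labels[j].
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    oa = correct / total
    ua, pa = {}, {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])                       # samples mapped as this class
        col_total = sum(matrix[r][i] for r in range(n))  # reference samples of this class
        ua[label] = matrix[i][i] / row_total if row_total else 0.0
        pa[label] = matrix[i][i] / col_total if col_total else 0.0
    return oa, ua, pa


# Hypothetical two-class confusion matrix.
labels = ["coral reef", "seagrass"]
matrix = [[40, 10],   # mapped as coral reef
          [5, 45]]    # mapped as seagrass

oa, ua, pa = accuracy_metrics(matrix, labels)
print(f"OA = {oa:.2%}")
for label in labels:
    print(f"{label}: UA = {ua[label]:.2%}, PA = {pa[label]:.2%}")
```

A low UA for a class means that many pixels mapped as that class belong to other classes in the field, whereas a low PA means that many field samples of that class were mapped as something else.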
4. Discussion
4.1. Accuracy Comparison
The goal of this research was to perform benthic habitat mapping using machine-learning algorithms, develop a mapping model from the most accurate result, and adapt the model to other areas. The machine-learning approach differs from parametric classification algorithms, which use training-area statistics to generate the centroid of each class cluster and then classify the remaining pixels based on the range of boxes, shortest distance, Mahalanobis distance, maximum probability, or angle of spectral similarity [58]. Machine learning may resolve the issue of non-Gaussian distribution of training areas, which in this research is mainly due to sub-pixel mixing of benthic habitats, differing numbers of training areas between classes, and possible inconsistencies in labeling the sample photos.
The use of CTA for benthic habitat mapping has been limited, so a direct comparison could not be performed. SVM is a powerful classifier for benthic habitat mapping, as described in previous works [8,9,52]. In this research, the accuracy of SVM is higher than that reported in other works that employed less complex classification schemes [8,9,34,53]. Its drawbacks are the failure to properly classify small life-form seagrass classes, compared with RF and CTA, and the required processing time. With SVM, four classes had zero accuracy, even after we experimented with different parameter settings to obtain a more effective classification and minimize misclassification.
It is also rather difficult to compare our result with others, since the scheme used in each study is unique. The closest is Reference [14], with 13 classes, which reached an accuracy of 40% using PC bands. However, that scheme does not contain any species differentiation of seagrass or variation of macroalgae pigments. Our accuracy is higher than that of other studies with the same or even lower scheme complexity. Even with hyperspectral data, an accuracy of over 80% was obtained only for benthic habitat mapping with 3–12 classes [7]. Recent studies using high-spatial-resolution images classified fewer than ten benthic habitat classes; examples are Reference [8], with 73% accuracy (four classes), and Reference [59], which employed the Spectral Angle Mapper (SAM) method on a CASI hyperspectral image (7 classes, accuracy not reported).
The misclassification patterns of the SVM and CTA results are similar: on average, the user’s and producer’s accuracies of classes containing multiple benthic classes (mixed classes) are lower than those of classes containing a single benthic class. In the RF result, by contrast, the average producer’s accuracy of mixed classes is lower but the average user’s accuracy of mixed classes is higher than that of classes containing a single benthic class (see Table 4, Table 5 and Table 6). Therefore, the RF algorithm performs better than SVM and CTA in correctly classifying different benthic habitat compositions. However, strong misclassification occurred not only for mixed classes but also for classes containing a single benthic object, owing to the similarity of spectral response between individual benthic classes, overlapping class descriptors, and class association, as explained in Section 3.1. As a consequence, even a class containing a single benthic object can have a high misclassification rate.
The application of PCA did little to improve the accuracy of RF, CTA, and SVM, which is not in accordance with the result of Reference [14], where PC bands outperformed other inputs such as deglint bands and DII in hierarchical benthic habitat mapping using ML. Another contrary result was obtained by Reference [34], where applying SVM to PC bands of a Landsat 8 image yielded higher accuracy than applying it to DII. In our results, the accuracy is relatively constant for all inputs, with a very low standard deviation. Our results also indicate that incorporating bathymetry and slope data in the classification input had no significant effect on classification accuracy. Depending on the main input bands and algorithm, adding bathymetry and slope data may or may not improve the accuracy; this is similar to the result of Reference [60], where the addition of LIDAR data increased or decreased the classification accuracy depending on the associated input bands and classification algorithm. Reference [7] indicated that bathymetry is not necessary to improve the accuracy of benthic habitat mapping, but Reference [8] reported otherwise. This difference may result from the quality of the bathymetry data. The bathymetry model of Reference [8] was generated from a more complex radiative transfer model, resulting in a more accurate bathymetry map and an accuracy improvement in the benthic habitat classification.
Reference [8] highlighted the importance of water-column correction, so that the dominant spectral response of underwater pixels originates solely from benthic habitats, whereas Reference [7] warned about the difficulty of removing the water-column effect in optically shallow water, owing to the need to obtain unique parameters for each spectral band. Additionally, correcting the water-column effect may not always be beneficial. Using ML, Reference [14] did not obtain an accuracy improvement by applying DII alone. Since there are several methods for removing the water-column effect, it is important to assess the impact of these methods on classification accuracy in future work.
In our case, the most accurate CTA and SVM results were obtained from DII. Although the accuracy improvement for CTA was only 2% over deglint bands, DII made a significant impact in resolving the misclassification between coral reef and seagrass. Interestingly, when the results of RF, SVM, and CTA from deglint bands, water-column-corrected bands, and PC bands are compared, the accuracy does not vary much. Even when all data were used at once, the accuracy was not significantly different. Hence, the accuracy obtained with machine-learning algorithms may reflect the maximum descriptive resolution of the remote-sensing image for mapping benthic habitats at this level of complexity.
4.2. Benefits and Setbacks of Machine-Learning Algorithms
With a machine-learning algorithm, it is possible to include various datasets as classification inputs, whether spectral bands, continuous datasets (e.g., bathymetry, slope, distance from shoreline), or categorical datasets (e.g., a coral reef geomorphology map), in order to obtain an accurate benthic habitat map. The more data we use, the more information is available to the machine-learning process, and we do not have to repeatedly process and classify the image under different scenarios. Thus, a machine-learning algorithm may produce a classification result at the maximum descriptive resolution of the image. In this research, the maximum accuracy is 88.54% (14 classes) and 94.17% (4 classes), both derived from RF using deglint bands. When we consider the various combinations of inputs or all input bands from all algorithms, the difference is insignificant (<5% standard deviation), which means that we can use either all the available data or only limited data such as deglint bands to produce high accuracy. Machine-learning algorithms are powerful enough to exploit the input bands to their maximum descriptive resolution, as can be observed from the insignificant difference between the accuracies of different input bands. Consequently, further processing may not be necessary, given that deglint bands are capable of producing an accurate result. We only need to verify the spatial distribution of benthic habitats by carefully identifying areas with potential misclassification.
We also maintained the detail provided by the WV2 spatial resolution (2 m) while obtaining higher classification accuracy. Depending on the benthic habitat complexity of the study area, the intended detail of the benthic habitat map, and the complexity of the classification scheme, either per-pixel classification or OBIA may produce the more representative spatial distribution of benthic habitats. In this research, we preferred pixel-level information, as the information contained in each pixel is unique and the variation of adjacent benthic habitats in the study area is not always constant. There are many small patches of coral reef, seagrass, sand, or macroalgae in between and among one another. The change from coral reef to adjacent habitat may not be linear and sometimes exhibits a distinct feature boundary [61]. This distinct boundary is essential for identifying biodiversity and ecological composition. Generalizing this variation may not be ecologically sound and reduces information on intra-habitat variation. Performing majority analysis may provide a better object distribution and less noise. However, it also decreases the precision provided by the WV2 image and thus negates the high cost of purchasing the image [61]. Ultimately, the choice between per-pixel classification and OBIA depends on the in situ object variations we intend to explain and on the aims of the benthic habitat mapping.
4.3. Machine-Learning Model Performance in Test Area
A model developed with a complex classification scheme is not effective when applied to other areas. Only a few classes were classified with good accuracy, especially the most dominant class, healthy coral reef. In addition, the detailed variation of benthic habitats may differ between areas, so a class in the developed model may not be present in the area to which it is applied, and the model may then map a non-existent class there. The more general scheme yielded better accuracy, though some classes still could not be mapped with high accuracy. Coral reef remains the class mapped with the highest accuracy, while macroalgae is mapped with the lowest.
For a model that can be widely applied, it is important not to use input bands with high variability. For instance, the quality of a water-column-corrected image is highly subjective and depends on the range of depths and the water-column attenuation coefficient. PC bands also rely strongly on the object variation in the scene, and bathymetry varies greatly between areas. As a result, the application of a model built on these inputs may fail. The most standard inputs would be surface reflectance and deglint bands. However, deglint bands may also introduce variability, depending on sunglint intensity.
The main goal of developing a benthic habitat mapping model is to establish an automatic, accurate, and consistent model that can be widely applied. This research has shown that such a task is possible, although it is premature to draw a firm conclusion, since the areas used to test the model have relatively similar reef types and benthic habitat complexity. We plan to apply the RF model developed in this research to map benthic habitats in different reef types with different benthic habitat complexities. It is also important to develop similar models using three visible bands, given that this is the most widely available spectral configuration, e.g., IKONOS, Quickbird, Geoeye-1, Skysat, PlanetScope, Rapideye, ALOS AVNIR-2, Sentinel-2, the Landsat series, and the SPOT series. This will ensure data continuity and make historical analysis of benthic habitats possible.
Finally, the classification scheme should be refined in order to understand whether the model scheme satisfactorily answers various management needs, and whether capturing the variation of benthic habitats across different geographical areas is necessary. For instance, the seagrass classes in our study did not include Thalassodendron ciliatum (Tc), which is uniquely abundant on Nusa Lembongan Island [42]. Furthermore, it is essential to find the balance between the detail of the benthic habitat classification scheme and the ease of adapting the scheme to different areas. If this can be achieved, we can predict a well-detailed benthic habitat map for inaccessible areas, since in situ data would no longer be required to train the classification, unless the benthic habitat composition is unique to the area. Obtaining in situ benthic habitat data for validation would then be subject to management priorities.