Towards Benthic Habitat 3D Mapping Using Machine Learning Algorithms and Structures from Motion Photogrammetry

The accurate classification and 3D mapping of benthic habitats in coastal ecosystems are vital for developing management strategies for these valuable shallow water environments. However, both automatic and semiautomatic approaches for deriving ecologically significant information from a towed video camera system are quite limited. In the current study, we demonstrate a semiautomated framework for high-resolution benthic habitat classification and 3D mapping using Structure from Motion and Multi View Stereo (SfM-MVS) algorithms and automated machine learning classifiers. The semiautomatic classification of benthic habitats was performed using several attributes extracted automatically from labeled examples by a human annotator using raw towed video camera image data. The Bagging of Features (BOF), Hue Saturation Value (HSV), and Gray Level Co-occurrence Matrix (GLCM) methods were used to extract these attributes from 3000 images. Three machine learning classifiers (k-nearest neighbor (k-NN), support vector machine (SVM), and bagging (BAG)) were trained by using these attributes, and their outputs were assembled by the fuzzy majority voting (FMV) algorithm. The correctly classified benthic habitat images were then geo-referenced using a differential global positioning system (DGPS). Finally, SfM-MVS techniques used the resulting classified geo-referenced images to produce high spatial resolution digital terrain models and orthophoto mosaics for each category. The framework was tested for the identification and 3D mapping of seven habitats in a portion of the Shiraho area in Japan. These seven habitats were corals (Acropora and Porites), blue corals (H. coerulea), brown algae, blue algae, soft sand, hard sediments (pebble, cobble, and boulders), and seagrass. Using the FMV algorithm, we achieved an overall accuracy of 93.5% in the semiautomatic classification of the seven habitats.


Introduction
Coastal ecosystems are important because they support high levels of biodiversity and primary production; however, their complexity and spatial and/or temporal variability make studying them particularly challenging. Currently, mapping of marine habitats is based mainly on two data sources, acoustic and optical [1,2]. Towed video cameras are cheaper than acoustic backscatter systems [3]. Moreover, improvements in high-resolution video cameras offer the opportunity to make very high-resolution in situ observations over large areas of seabed, coupled with the advantage of simultaneously collecting geographical information for these habitat species [4]. These high-resolution seafloor coverage area, as well as caustics effects [16]. These systems must track a broad spectrum of known and unknown features on the seafloor. However, towed cameras operate in water conditions, as the camera and the features are in the water. As a result, the refraction effects can be corrected by the camera calibration process [17]. Owing to the complexity of the technical conditions, quality, and resolution, there is still a lack of automatic methods for obtaining significant data from seabed images. Semiautomatic or automatic feature classification is not utilized often in marine sciences [18], mainly because the application of suitable algorithms is difficult.
Alternatively, other researchers have studied the classical unsupervised classification of images and the identification of segmented classes using field measurements. For instance, Vassallo et al. [2] proposed complete acoustic coverage of the seafloor with a comparatively low number of sea ground-truth samples to produce benthic habitat maps. An unsupervised fuzzy c-means clustering algorithm was applied to recognize five coralligenous habitats with a set of observations made from field samples collected by scuba divers. The categorized coral features were Cystoseira zosteroides, Axinella polypoides, Eunicella cavolini, Eunicella singularis, and Paramuricea clavata. A total of 57 images were used for training and testing the proposed model. The overall accuracy (OA) of the classification reached 89%, with a final map scale of 1:25,000. Baumstark et al. [19] classified five benthic habitats (i.e., hard bottom, sand mixed seagrass, seagrass dense, seagrass medium, and seagrass sparse) from a WorldView-2 image using an object-based image analysis (OBIA) approach with unsupervised classification. The accuracy assessment process was performed using 65 random points for all benthic classes. Although the OA was 78%, which is considered to be lower than typical accuracy standards, the authors believe that this accuracy could be improved with additional ground-truth samples. Conversely, Baumstark et al. [20] presented a completely unsupervised classification method for marine algae identification using airborne hyperspectral images. Their proposed approach estimates the optimal number of classes and the final partition automatically. They mapped three classes of marine algae: brown algae, a substrate (rocks, pebbles, and sand), and green algae. Only 23 ground-truth points were used to assess accuracy. Nevertheless, these unsupervised methods have several drawbacks. First, they are influenced considerably by data reliability, accuracy, and resolution. Second, sea-truthing samples remain indispensable for the prediction and verification of accuracy. Finally, an inappropriate analysis of the outputs and results from small verification samples can lead to management errors.
The SfM-MVS process is a combination of computer vision and photogrammetry that utilizes overlapping images taken at various angles for the accurate construction of 3D models. The main advantage of SfM-MVS is that the geometry of the photographed scene, the camera position, and the interior and exterior orientations can be reconstructed with only limited ground control [21,22]. Consequently, SfM-MVS is well suited to images acquired by low-cost, nonmetric towed cameras. First, SfM determines the above parameters simultaneously using a highly redundant and iterative bundle adjustment procedure, which is based on a dataset of invariant features extracted from multiple overlapping images [22,23]. These features are tracked from one image to another, enabling initial estimates of the camera position and object coordinates that are then refined iteratively by means of nonlinear least squares minimization [23,24]. Then, MVS finds correspondences between stereo images and applies regularization in the object space to produce dense 3D point clouds [25,26]. Indeed, SfM-MVS techniques are considered to be rapid and low-cost tools for producing scaled 3D digital models and orthophoto mosaics, while also automatically resolving the distortions of underwater refraction. These models allow scientists to study various properties of the benthic community, such as the live surface area, biomass, and colony volume, and also to analyze any changes in these communities over time [27]. The ability to quantify these features will greatly enhance both biological and ecological investigations of coral reef ecosystems.
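To make the nonlinear least squares step more concrete, the following minimal Python sketch evaluates the quantity that a bundle adjustment would typically minimize: the sum of squared differences between observed image points and the projections of the estimated 3D points. It assumes an ideal pinhole camera without lens distortion or refraction; the function names, the focal length, and the toy coordinates are hypothetical and do not come from the study.

```python
def project(point, rotation, translation, focal):
    """Project a 3D point with an ideal pinhole camera (no distortion assumed)."""
    # Transform the world point into the camera frame: Xc = R * X + t.
    xc, yc, zc = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    # Perspective division followed by scaling with the focal length (pixels).
    return (focal * xc / zc, focal * yc / zc)


def reprojection_cost(observations, cameras, points, focal):
    """Sum of squared pixel residuals over all (camera, point) observations."""
    cost = 0.0
    for cam_idx, pt_idx, (u_obs, v_obs) in observations:
        rotation, translation = cameras[cam_idx]
        u, v = project(points[pt_idx], rotation, translation, focal)
        cost += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return cost


# Hypothetical toy scene: two cameras 1 m apart viewing one seabed point.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = [(IDENTITY, (0.0, 0.0, 0.0)), (IDENTITY, (-1.0, 0.0, 0.0))]
observations = [(0, 0, (250.0, 100.0)), (1, 0, (-250.0, 100.0))]

cost_true = reprojection_cost(observations, cameras, [(0.5, 0.2, 2.0)], 1000.0)
cost_wrong = reprojection_cost(observations, cameras, [(0.5, 0.2, 2.5)], 1000.0)
```

In this toy case the cost is zero when the point sits at the depth that generated the observations and grows when the depth estimate is wrong. An optimizer would adjust camera and point parameters together to drive this cost down.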
Recent studies have demonstrated the importance of SfM-MVS as another tool for reef monitoring [27,28], long term monitoring of benthic communities and legacy data rescue [25], monitoring the demography and morphometry of soft corals such as gorgonian species [29], large underwater area 3D reconstruction [30], and bathymetry determination [31][32][33]. Moreover, Burns et al. [22], Figueria et al. [34], and Leon et al. [28] used SfM-MVS to measure multiple metrics of 3D habitat complexity, providing accuracy measurements, and computed rugosities. In addition, Storlazzi et al. [35] proved that SfM-MVS techniques are more effective and more quantitatively powerful than classical methods in characterizing benthic habitats. Therefore, this might be considered the end of the "chain-and-tape" method for measuring benthic complexity. Raoult et al. [36] studied the error limits for coral reef measurements observed at various times and by different observers. Their findings showed that coral reef measurements were consistent between observers and over time and also that photographic coverage is more important than the numbers of pictures.
Moreover, recent studies have attempted to combine SfM-MVS with benthic cover classification to produce categorized 3D benthic cover maps. However, most of these studies created 3D models for the overall area and then applied the classification process using either the 3D models or some derived variables (e.g., slope and rugosity). Ahsan et al. [37] proposed a predictive learning approach for benthic habitat mapping using 3D model features and seabed terrain features. These 3D model features included local binary patterns, modified HSV histograms, and a visual rugosity index. Furthermore, the terrain features included the depth, rugosity, slope, aspect, profile curvature, and plan curvature produced from high-resolution AUV multibeam bathymetry. Six habitat classes (high relief reefs (two classes), low relief reefs, coarse sand/sand, screw shell rubble/sand, and Ecklonia) were classified over the Tasman Peninsula in Tasmania, Australia. The resulting accuracies of the proposed models were between 0.69 and 0.78 using 10-fold cross-validation. These results demonstrate that some of the classes were misclassified and that the applied bathymetric features were not adequately descriptive to classify these habitats. Price et al. [38] used ROV video records to construct 3D models of cold-water coral reefs at 1000 m depths over a tributary of Whittard Canyon, North East Atlantic. SfM-MVS was applied to generate sub-centimeter resolution 3D reconstructions. The resulting digital elevation models were used to produce rugosity metrics, and the orthomosaics were used for coral coverage assessment. To assess coral coverage percentage and substrate type, ImageJ macro-code was used to assign 250 points across each orthomosaic. Six habitats were identified: live coral, dead coral, hard rock, mudstone, mixed sediment, and litter. The results show that SfM-MVS can quantify cold-water coral structural complexity and create 3D habitat maps over larger areas. Williams et al. [39] studied combining Simultaneous Localization and Mapping (SLAM) trajectory and stereo pair images from an AUV to create detailed 3D maps of seafloor survey datasets. This combination was used to document benthic habitats at Ningaloo Reef, Western Australia. The resulting composite 3D meshes provided a useful tool for evaluating the scales and distributions of benthic habitat spatial patterns. Pavoni et al. [40] presented several strategies for improving the semantic segmentation of benthic habitats using high-resolution orthomosaic maps. Furthermore, to overcome the problem of reduced training datasets produced from a single orthomosaic, a simple oversampling strategy in the dataspace, based on size- and shape-driven cropping of the sample, was proposed. The resulting maximum accuracy was 0.95 when classifying one soft coral digitate class.
However, the abovementioned approaches have several drawbacks: (1) producing 3D models for the overall area is a time-consuming and labor-intensive process, especially for mapping large study areas; (2) if the number of benthic habitat classes were to increase, the majority of these approaches would result in comparatively low classification accuracy; (3) integrating multibeam bathymetry, which is relatively expensive, with 3D mosaics to improve the results [37] would increase the process costs; and (4) the integrated bathymetric features were not sufficiently descriptive to classify these habitats. Accordingly, the proposed approach attempts to overcome these drawbacks by classifying high-resolution geo-referenced images using machine learning algorithms. These images can be collected by a simple high-resolution camera that can be towed beneath a small vessel. Subsequently, SfM-MVS techniques can be used to produce fine-scale categorized 3D benthic habitat maps using the correctly categorized images.
The aim of this paper is to propose a semiautomated framework for benthic cover classification and 3D mapping. This framework exploits underwater footage, soft classifiers, and SfM-MVS techniques to produce high-resolution 3D habitat maps of shallow coastal reef systems. A simple but cost-effective towed video camera system was used to collect the geolocated images. Three approaches (the BOF [41,42] technique, HSV [43,44] color features, and GLCM [45,46] texture features) were tested to extract attributes from these images for the semiautomatic classification of benthic cover. Moreover, a detailed analysis was conducted to identify the extracted attributes that would best increase the discrimination capability of the classifiers. Next, the outputs of three soft classifiers (k-NN, SVM, and BAG) were combined into an ensemble using the FMV algorithm for benthic feature classification. The OA and the Kappa statistical criteria were used for the evaluation and comparison of benthic cover classification. Finally, SfM-MVS methods were applied to produce 3D benthic cover maps using the resulting correctly classified geolocated images.

Study Area
The study site is located in the Shiraho subtropical region, which is included partly in the southeastern part of Ishigaki Island, Japan (see Figure 1). This is an area rich in marine biodiversity, with shallow, low-turbidity water [47] and a maximum depth of 3.5 m. The Shiraho area has various reefscapes, including complex patches of branching (Acropora) and massive corals (Porites). Moreover, it has a large colony of a Blue Ridge coral (Heliopora coerulea). There is also a wide range of brown algae and blue algae, a variety of geomorphic features (soft sand, cobble, and boulders), and seagrass.
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 16

Benthic Cover Field Data
Benthic cover field data collection began on 21 August 2016 (see Figure 1). Underwater videos were obtained using a low-cost, compact, high-resolution camcorder (GoPro HERO3 Black Edition, 1440 video resolution, 30 frames per second with a wide field of view). The GoPro HERO3 camera has stable interior orientation parameters (IOPs) [48]. The camcorder was attached to the side of the survey motorboat, just under the water surface, to observe the shallow seabed. Four hours of video recordings were acquired during the survey trip. These recordings were geolocated using a differential global positioning system (DGPS) receiver mounted vertically above the camera on a wooden stand. Figure 2 illustrates the positions of the DGPS and the camera on the motorboat. A free video-to-image converter software package was used to extract images from the video files at a one-second interval, with a minimum overlap of 60%, so that the images were synchronized with the DGPS observations. The DGPS kinematic observations were post-processed using the online Natural Resources Canada Precise Point Positioning (PPP) service (https://webapp.geod.nrcan.gc.ca/geod/tools-outils/ppp.php), thereby achieving centimeter-level accuracy. To properly locate the camera center points, 55 cm, which was the distance between the receiver antenna and the camera position, was subtracted in the Z direction. The resulting positions were used as the input camera positions for 3D model production. From the extracted images, 3000 images with known locations (from the DGPS) were labeled manually for seven classes (see Figure 3).
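The geolocation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a dictionary keyed by whole-second timestamps), the function name, and the coordinate values are hypothetical; only the 55 cm vertical antenna-to-camera offset and the one-second frame interval come from the text.

```python
ANTENNA_TO_CAMERA_M = 0.55  # vertical antenna-to-camera offset stated in the text


def camera_positions(dgps_fixes, frame_times):
    """Return {frame_time: (x, y, z_camera)} for frames with a matching fix.

    dgps_fixes: dict mapping a whole-second timestamp to (x, y, z_antenna).
    frame_times: whole-second timestamps of the extracted video frames.
    """
    positions = {}
    for t in frame_times:
        fix = dgps_fixes.get(t)
        if fix is None:
            continue  # no DGPS epoch for this frame, so it stays unlocated
        x, y, z_antenna = fix
        # Lower the antenna height by the fixed offset to reach the camera.
        positions[t] = (x, y, z_antenna - ANTENNA_TO_CAMERA_M)
    return positions


# Hypothetical post-processed fixes (projected coordinates in metres).
fixes = {0: (100.0, 200.0, 1.25), 1: (100.4, 200.1, 1.30)}
located = camera_positions(fixes, frame_times=[0, 1, 2])
```

Frames without a matching DGPS epoch (timestamp 2 in this toy input) are simply skipped here; how such frames were handled in the actual survey is not stated in the text.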

Methodology
The proposed framework for benthic cover classification and 3D mapping over the Shiraho area was established as follows:
1. All four hours of the video recordings were converted to geolocated images using a free video to JPG converter program with one-second intervals synchronized with the DGPS recorded locations.
2. A total of 3000 converted images were labeled individually by a human expert according to seven benthic cover categories: corals (Acropora and Porites), blue corals (H. coerulea), brown algae, blue algae, soft sand, hard sediments (pebbles, cobbles, and boulders), and seagrass.
3. These labeled geolocated images were used as inputs for the BOF, Hue Saturation Value (HSV), and Gray Level Co-occurrence Matrix (GLCM) approaches to create the attributes for the semiautomatic classification.
4. The extracted attributes produced from the BOF, HSV, and GLCM approaches were used as the inputs for training three machine learning soft classifiers (BAG, SVM, and k-NN), and the image labels were used as the outputs.
5. The three classifiers were combined with the FMV algorithm to classify the benthic cover categories.
6. The entire classifier evaluation process was conducted using 2250 independent randomly sampled images (75%) for training and 750 images (25%) for testing.
7. After the FMV algorithm was trained and validated, it was used to categorize more images, and the resulting images were checked individually.
8. SfM-MVS techniques were performed to produce 3D mosaics and digital terrain models (DTMs) for each habitat class using the correctly categorized geolocated JPG images.
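The random 75%/25% partition in step 6 can be illustrated with a short Python sketch. The function name, the fixed seed, and the use of integer image identifiers are assumptions made for the example; the study itself was carried out in MATLAB and does not describe how the sampling was implemented.

```python
import random


def split_train_test(items, train_fraction=0.75, seed=0):
    """Shuffle the items once and cut them into training and testing subsets."""
    shuffled = list(items)
    # A fixed seed (an assumption here) makes the partition reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 3000 labeled images, represented here by hypothetical integer identifiers.
image_ids = list(range(3000))
train_ids, test_ids = split_train_test(image_ids)
```

With 3000 labeled images this yields 2250 training and 750 testing identifiers with no overlap, which matches the counts given in step 6.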
The entire benthic cover classification process was implemented in the MATLAB environment, with the parameters described below for each method.
For extracting benthic habitat categorization attributes, 26 texture parameters, 256 HSV values, and 250 BOF attributes were extracted from each image using the MATLAB environment (see Table 1). We tested the principal component analysis approach for removing redundancies from the input attributes, but the OA decreased significantly to 75%.
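As an illustration of the texture attributes, the sketch below builds a gray-level co-occurrence matrix for a single pixel offset and derives three widely used statistics from it. It is a simplified stand-in written in Python: the text does not list which 26 texture parameters were used, so the choice of contrast, energy, and homogeneity, the single horizontal offset, and the tiny four-level test image are all assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pixel pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }


# Small four-level test image (hypothetical, not survey data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(image, levels=4)
features = texture_features(matrix)
```

In practice such statistics would be computed for several offsets and directions on a quantized grayscale version of each frame, and the resulting values would be concatenated with the color and BOF attributes.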
Subsequently, to classify benthic cover, BAG [49,50], SVM [51,52], and k-NN [53,54] soft classifiers were used with the following parameters. The BAG approach assembled 30 classification trees with 24 splits for each tree; the SVM model used a third-order polynomial kernel function; and the k-NN approach used five neighbors, the city block distance, and squared inverse distance weighting. For each algorithm, these parameters produced the highest OA and Kappa values. Finally, the resulting probabilities from each soft classifier were combined using the FMV [55,56] model.
Agisoft PhotoScan Professional software was used to produce the 3D categorized DTMs and orthophoto mosaics based on the SfM-MVS approach. Four steps were used, and the parameters of each step were chosen according to the settings recommended by the software developers and the results of trial runs. The first step, known as photo alignment, uses the post-processed camera positions as input. This step includes computing initial estimates of the IOPs using a scale-invariant feature transform matching algorithm. Then, a local reference coordinate system is established to create the initial datum for determining the images' exterior orientation parameters and the 3D coordinates of the matched points [48]. Second, a bundle adjustment procedure is performed to refine the exterior orientation parameters and object coordinates created in the first step [57]. Third, a dense point cloud with ultrahigh quality is generated based on the MVS technique. Then, a mesh is created from the dense point cloud with an arbitrary surface type, thereby enabling interpolation for the final model generation. Fourth, a texture atlas with a generic mapping mode is built for the model and used to generate the orthophoto mosaic. Finally, all the categorized DTMs and orthophoto mosaics are exported with a pixel resolution of 5 cm. The key procedures of the proposed methodology are shown in Figure 4.
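The k-NN settings listed above (five neighbors, city block distance, squared inverse distance weighting) can be sketched as a small soft classifier in Python. This is an illustrative reimplementation rather than the MATLAB code used in the study; the function name, the `eps` guard against zero distances, and the two-dimensional toy attributes are assumptions.

```python
def knn_class_probabilities(query, samples, labels, k=5, eps=1e-12):
    """Class membership scores from a distance-weighted k-NN vote."""
    # City block (Manhattan) distance from the query to every training sample.
    dists = [
        (sum(abs(q - s) for q, s in zip(query, sample)), label)
        for sample, label in zip(samples, labels)
    ]
    nearest = sorted(dists, key=lambda d: d[0])[:k]
    # Squared inverse distance weighting: w = 1 / d^2 (eps guards d == 0).
    votes = {}
    for dist, label in nearest:
        votes[label] = votes.get(label, 0.0) + 1.0 / (dist * dist + eps)
    total = sum(votes.values())
    return {label: weight / total for label, weight in votes.items()}


# Hypothetical two-attribute training set with two habitat labels.
samples = [(0, 0), (1, 0), (0, 1), (4, 4), (5, 4), (4, 5)]
labels = ["sand", "sand", "sand", "seagrass", "seagrass", "seagrass"]
probs = knn_class_probabilities((1, 1), samples, labels)
```

Returning normalized scores instead of a single label is what makes the classifier "soft": the per-class values can later be fused with the outputs of other classifiers.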

Results
Figure 5 shows examples of correct and misclassified benthic cover categories resulting from the proposed FMV algorithm. Additionally, the numbers of the correct species categorized from the k-NN, BAG, and SVM classifiers using different attributes are illustrated in Figure 6. Table 2 summarizes the corresponding OA and Kappa values for k-NN, BAG, SVM, and FMV ensemble classifiers, and Table 3 presents the confusion matrix for the classification of benthic habitats using the FMV ensemble. Figures 7-11 show a sample of the correctly categorized 3D views, orthophoto mosaics, and DTMs for each habitat.
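Both evaluation criteria, OA and the Kappa coefficient, are derived from a confusion matrix. The Python sketch below shows the standard calculation (observed agreement, and Cohen's Kappa as agreement corrected for chance). The two-class matrix is a made-up example and does not reproduce the values of Table 3.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's Kappa) for a square confusion matrix."""
    n = sum(map(sum, confusion))
    # Observed agreement: share of samples on the main diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical matrix: rows are reference labels, columns are predictions.
toy_confusion = [
    [45, 5],
    [10, 40],
]
oa, kappa = overall_accuracy_and_kappa(toy_confusion)
```

For this toy matrix the observed agreement is 0.85 and the chance agreement is 0.50, so Kappa is 0.70. Kappa is lower than OA because it discounts the agreement that would occur by chance.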
Figure 5. Samples of the correct (green) and misclassified (red) benthic cover images over the Shiraho area.

Figure 11. 3D perspective view, orthophoto mosaic, and DTM for a categorized seagrass sample.

Discussion
In our study, we investigated the discrimination attributes of various benthic cover features. A comparison was made between 256 HSV visual values, 26 GLCM texture values, and 250 BOF values for benthic cover classification. BOF attributes produced the highest species discrimination accuracy, followed by HSV predictors and the GLCM texture values. Additionally, compared to using any single predictor, assembling three predictor groups improved the accuracy of species classification.
The extracted benthic cover attributes were used as the input for three supervised classifiers: SVM, BAG, and k-NN. These classifiers were selected following numerous trials based on the highest OA and Kappa values. Other classifiers, such as boosting trees, maximum likelihood, and neural networks, were also tested for benthic classification but yielded lower OA values. SVM produced significantly better results for the classification of all benthic cover species compared to both the k-NN and BAG classifiers. The most challenging part of this process was discriminating between soft sand and sediments and between corals and blue corals. The mixture of soft sand and sediments in most benthic cover created confusion for all classifiers. In addition, in most of the benthic cover images, the blue corals and corals were difficult to distinguish visually. Furthermore, image noise, coupled with poor lighting and water turbidity, reduced the discrimination accuracy of all classifiers. Nevertheless, the algae, brown algae, and seagrass species were classified with high accuracy (see Table 3).
Chagoonian et al. [58] showed that soft classifiers are more accurate in coral reef mapping than traditional hard classifiers. As a result, together, these classifiers can produce more informative maps, especially in very heterogeneous coral reef environments. Moreover, recent studies (e.g., [59][60][61]) have shown that ensemble approaches outperform single classifiers in benthic cover mapping. The proposed FMV ensemble in our study readily combines the probabilities resulting from BAG, SVM, and k-NN soft classifiers that were trained independently. These soft classifiers resulted in diverse per-class accuracy, caused mainly by the differences in their concepts. Relative to the three base classifiers, the FMV ensemble increased the OA and Kappa values for benthic cover classification by approximately 4% and 0.05, to 93.5% and 0.92, respectively. These findings demonstrate that the FMV ensemble produces higher classification accuracy than the individual BAG, SVM, and k-NN classifiers used in benthic cover classification.
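The way an FMV ensemble can fuse soft classifier outputs may be sketched as follows. The text does not spell out the FMV formulation, so this Python example assumes one common variant: for each class, the classifier probabilities are sorted in descending order and aggregated with ordered weighted averaging (OWA) weights derived from a fuzzy linguistic quantifier. The quantifier parameters (0.3, 0.8), the function names, and the probability values are assumptions, not values from the study.

```python
def quantifier(r, a=0.3, b=0.8):
    """Fuzzy linguistic quantifier; (a, b) = (0.3, 0.8) is assumed here."""
    if r < a:
        return 0.0
    if r > b:
        return 1.0
    return (r - a) / (b - a)


def owa_weights(n, a=0.3, b=0.8):
    """Ordered weighted averaging weights derived from the quantifier."""
    return [
        quantifier(i / n, a, b) - quantifier((i - 1) / n, a, b)
        for i in range(1, n + 1)
    ]


def fuzzy_majority_vote(prob_rows, a=0.3, b=0.8):
    """Fuse per-classifier class probabilities; return (best_class, scores)."""
    weights = owa_weights(len(prob_rows), a, b)
    scores = {}
    for cls in prob_rows[0]:
        # Rank this class's probabilities from most to least confident.
        ranked = sorted((row[cls] for row in prob_rows), reverse=True)
        scores[cls] = sum(w * p for w, p in zip(weights, ranked))
    return max(scores, key=scores.get), scores


# Hypothetical outputs of three soft classifiers for one image.
knn_out = {"coral": 0.6, "blue_coral": 0.3, "sand": 0.1}
svm_out = {"coral": 0.4, "blue_coral": 0.5, "sand": 0.1}
bag_out = {"coral": 0.7, "blue_coral": 0.2, "sand": 0.1}
best_class, fused = fuzzy_majority_vote([knn_out, svm_out, bag_out])
```

In this toy case one classifier prefers blue coral, but the fused score still favors coral because the middle-ranked probability carries most of the OWA weight. This is the sense in which a fuzzy majority, rather than a single confident classifier, decides the label.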
In this study, we performed benthic habitat classification semiautomatically using visual, texture, and color attributes and then produced 3D categorized thematic maps using SfM-MVS algorithms. Such 3D maps can be used to measure the physical aspects of corals, such as volume, surface roughness, and the proportion of living and/or dead corals [28][29][30][31][32][33][34][35]. The fine scales targeted in this study significantly improve the spatial resolution of diagnoses and predictions for coral reefs. SfM-MVS techniques were applied successfully to produce categorized 3D models for five benthic habitats; only soft sand and sediments were excluded. However, in a few cases these techniques failed to align the overlapping images. These cases involved poor-quality towed-camera images acquired under low illumination over turbid areas. Furthermore, the most complicated areas were the seagrass meadows: because seagrass is frequently moved by waves and currents, 3D mapping of the seagrass canopy was difficult.
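As an example of how such a 3D product might be turned into a structural metric, the sketch below estimates surface rugosity from a gridded DTM as the ratio of 3D surface area to planar area. The grid layout, the triangulation of each cell into two triangles, the function names, and the toy elevation values are assumptions for illustration and are not part of the workflow reported here.

```python
import math


def triangle_area(p, q, r):
    """Area of a 3D triangle from half the norm of the edge cross product."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    cx = uy * vz - uz * vy
    cy = uz * vx - ux * vz
    cz = ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def surface_rugosity(dtm, cell_size):
    """Ratio of 3D surface area to planar area for a regular elevation grid.

    dtm[i][j] is the elevation at row i, column j; cell_size is the grid
    spacing in the same units as the elevations.
    """
    rows, cols = len(dtm), len(dtm[0])
    surface = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            x0, x1 = j * cell_size, (j + 1) * cell_size
            y0, y1 = i * cell_size, (i + 1) * cell_size
            p00 = (x0, y0, dtm[i][j])
            p01 = (x1, y0, dtm[i][j + 1])
            p10 = (x0, y1, dtm[i + 1][j])
            p11 = (x1, y1, dtm[i + 1][j + 1])
            # Split each grid cell into two triangles and sum their areas.
            surface += triangle_area(p00, p01, p11) + triangle_area(p00, p11, p10)
    planar = (rows - 1) * (cols - 1) * cell_size ** 2
    return surface / planar


# Toy grid (hypothetical elevations in metres, 1 m spacing).
toy_dtm = [[0.0, 1.0, 2.0],
           [0.0, 1.0, 2.0],
           [0.0, 1.0, 2.0]]
rugosity = surface_rugosity(toy_dtm, cell_size=1.0)
```

A perfectly flat surface gives a value of 1, and larger values indicate more structurally complex terrain.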
In summary, the proposed benthic cover classification and 3D mapping system has many advantages and involves simpler logistics than current methods. The required geolocated images can be obtained using a low-cost towed camera mounted beneath a small motorboat equipped with a DGPS system. These field surveys do not harm the surrounding environment and can be repeated annually to monitor the health of benthic habitats using the same FMV ensemble. The resulting high-resolution 3D maps can be used to monitor either the bleaching of coral reefs or their spatial and temporal changes. Finally, compared with deep learning algorithms, the present approach requires only simple programs, relatively short processing times, modest labor, and a small number of field images to train the classifiers. Nevertheless, some drawbacks require improvement (e.g., limited performance in complex mixed areas and high-turbidity areas, and the restriction to shallow areas that can be processed). Still, these results should encourage further studies on the proposed approach, such as using ROV systems to produce higher-quality images for monitoring deep seafloor areas [30]. Moreover, the same approach could be studied with the new Fluid lensing optical multispectral instrument (FluidCam) and the Multispectral Imaging Detection and Active Reflectance instrument (MiDAR) developed by NASA [62]. FluidCam can produce 3D multispectral images of shallow marine environments that are corrected for refraction. MiDAR is an active multispectral sensor that illuminates targets with high-intensity narrowband radiation to produce multispectral images across the ultraviolet, visible, and near-infrared bands. The additional multispectral bands can be used to increase the classification accuracy over heterogeneous areas. Finally, image enhancement techniques should be tested for improving the quality of the towed camera images and reducing the effects of turbidity and caustics.

Conclusions
The SfM-MVS approaches and machine learning algorithms are effective tools for monitoring coastal areas, particularly vulnerable shallow habitats where species are threatened by climate change and human activity [31]. Recently, more attempts have been made to apply both techniques for generating high-resolution 3D information. In this study, we applied SfM-MVS and machine learning algorithms to produce high-resolution categorized 3D habitat maps. The framework presented herein was tested for classifying seven habitats in the heterogeneous Shiraho coastal area of Japan. The framework was constructed as follows. First, the semiautomatic classification of benthic habitats was based on the BOF, HSV, and GLCM attribute extraction techniques. Several attributes were extracted and assessed from labeled examples using images from a towed video camera; these examples were geolocated by the DGPS system. The outputs of three soft classifiers (k-NN, SVM, and BAG) were assembled by the FMV algorithm. Our results show that the semiautomatic classification of the seven habitats achieved an OA of 93.5% using the FMV algorithm. Second, fine-scale 3D products such as DTMs and orthophoto mosaics were generated from the correctly categorized georeferenced images using the SfM-MVS techniques. These products are vital for studying topography, rugosity, and the other structural characteristics of benthic communities. The simplicity of this framework facilitates its repeatability and opens the possibility of generating usable products for a broad range of ecological applications.