Machine Learning Classification and Accuracy Assessment from High-Resolution Images of Coastal Wetlands

Abstract: High-resolution images obtained by multispectral cameras mounted on Unmanned Aerial Vehicles (UAVs) are helping to capture the heterogeneity of the environment in images that can be discretized into categories during a classification process. There is currently an increasing use of supervised machine learning (ML) classifiers to retrieve accurate results from scarce datasets containing samples with non-linear relationships. We compared the accuracies of two ML classifiers using pixel-based and object-based analysis approaches in six coastal wetland sites. The results show that Random Forest (RF) performs better than the K-Nearest Neighbors (KNN) algorithm in the classification of both pixels and objects, and that classification based on pixel analysis is slightly better than object-based analysis. The agreement between the object and pixel classifications is higher with Random Forest. This is likely due to the heterogeneity of the study areas, where pixel-based classifications are most appropriate. In addition, from an ecological perspective, as these wetlands are heterogeneous, the pixel-based classification reflects a more realistic interpretation of plant community distribution.


Introduction
Studying the vegetation structure within plant communities is key for the environmental sciences because plant communities play a major role in detecting environmental change and monitoring ecological condition [1][2][3][4]. Remote sensing has been increasingly used in ecology as a powerful method to monitor large areas using direct measurements of vegetation [5]. Satellite imagery is being used successfully to monitor large areas [6][7][8], although the spatial resolution of satellite images is not always adequate to capture the spectral details of spatially heterogeneous vegetation types, such as grassland, heathland or wetland vegetation. Using high-resolution images in ecology offers the possibility to capture the heterogeneity of an area of interest, with variations in plant community cover, at a small pixel resolution (≈10 cm) [9][10][11].
The most common tools used to capture these types of images are multispectral cameras mounted on Unmanned Aerial Vehicles (UAVs). UAVs offer a platform for the rapid measurement of relatively large areas by overflying remote or inaccessible locations using non-destructive, cost-effective and near-real-time monitoring routines [10], with the capacity to precisely monitor changes in vegetation at a local scale as well as to measure the spatial distribution of plant communities [11]. This supplements intensive surveying of species and also expands the potential application of field survey methods, while accounting for changes in vegetation over time [12]. In addition to the applications of multispectral sensors, RGB (Red, Green, Blue) cameras can also capture elevation data using recent improvements in point cloud reconstruction methods, such as Structure from Motion [13], detecting microtopographical variations and generating accurate digital elevation models [14].
Classifying the pixel values of images is a common process in ecological studies using remotely sensed data. This involves collecting field data from a series of plots as an input for training a classification model. The model, or classifier algorithm, labels pixels according to their spectral similarity to the initial values overlapping the plots, and the final classification reveals patterns of distribution in the image, which can be valuable to support management and conservation decision making [9,15,16].
Traditional supervised classifiers assume a normal distribution of the remotely sensed data values [17], and the Maximum Likelihood classifier is the most commonly used algorithm of this kind [18]. Nevertheless, datasets can hide complex and non-linear relationships that vary in space; such datasets do not fit classical statistical methods [19,20].
Machine learning (ML) has emerged as a field of artificial intelligence that deals with complex reference data to build a classifier based on data-driven decisions. For this reason, studies in remote sensing have increasingly used different ML algorithms, because they can outperform traditional classifiers in ecology and Earth science applications and they are fundamentally non-parametric [20,21].
Of the ML algorithms evaluated for classifying remote sensing images, the most commonly used are Support Vector Machines, Random Forest and boosted Decision Trees, although there is no general agreement on which algorithm performs best, because this depends on the training data, explanatory variables and number of categories to be classified [22]. The K-Nearest Neighbors (KNN) classifier does not train a model but compares an unknown sample to the training data, and is also a non-parametric classifier [23]. This algorithm is widely used for its capacity to predict multiple response variables simultaneously by finding the best K parameter [24]. Random Forest (RF) is a robust algorithm that is not overly affected by parameter settings, and its classification accuracy does not decrease significantly when using small training datasets [25].
Typically, classification in remote sensing is a process that directly uses pixels; nevertheless, in recent years, some studies have addressed the task of classification with a prior process of segmentation, obtaining groups of pixels, called objects, with homogeneous spectral characteristics and non-spectral attributes, such as shape, relationship and adjacency metrics [26,27]. This approach, called object-based image analysis (OBIA), shows some advantages over pixel-based image analysis (PBIA), particularly where objects have a spatial context and represent homogeneous patches, such as plant communities, larger than the data pixel size [27,28].
In order to compare the accuracy of classifications using the OBIA and PBIA approaches on spectrally similar and spatially heterogeneous categories, we tested two ML classification algorithms on plant community data from coastal wetlands. Plant communities in such ecosystems consist of patches of vegetation, where hydrology, topography and human management are often the main factors that influence their distribution [29], resulting in a heterogeneous distribution. Thus far, the studies comparing the two approaches in wetlands have used a coarser image resolution [30,31]. Thus, the main focus of this study is to perform a classification of high-resolution images from a UAV platform based on two ML algorithms using both the OBIA and PBIA approaches.
The main objectives are to: (1) classify plant communities in coastal wetlands using two different ML algorithms from high-resolution UAV images; (2) compare the classification accuracies of pixels and objects; (3) perform a quality assessment between each classification map.

Study Areas
This study was undertaken in six coastal wet grasslands of Estonia, which belong to the 'Boreal Baltic coastal meadows' habitat type according to Annex I of the EU Habitats Directive (1992). These coastal wetlands are located in an extensive transitional zone from the sea to terrestrial ecosystems, characterized by a gradient of hydrological conditions and different inundation regimes that largely determine the distribution of the plant communities [32,33]. Several studies have shown that microtopography also has an important influence on the location and extent of plant communities [34], including in Estonian coastal wetlands, where the total elevation differences are typically between 0 and 3 m, the tidal variation is negligible (0.02 m) and the range of plant communities is maintained through low-intensity grazing [35,36].
Figure 1 shows the location of the six study sites on the west coast of Estonia, all within nature protection areas and under low-intensity land management regimes [33,37]. Kudani (KUD), Tahu North (TAN) and Tahu South (TAS) are within the Silma Nature Reserve; Ralby (RAL) and Rumpo East (RUE) are within the Vormsi Landscape Protection Area; and Matsalu02 (MA2) is within the Matsalu National Park. These study areas were chosen with the aim of classifying different structures of plant communities in coastal wetlands with different degrees of heterogeneity and microtopography. The study areas were digitized manually, excluding the forests. The plant communities present in the study areas are categorized into five classes according to Burnside et al. [38]: Lower Shore (LS), Open Pioneer (OP), Reed Swamp (RS), Tall Grassland (TG) and Upper Shore (US). Table 1 lists the plant communities sampled in each study area, together with the elevation range and area; as seen in Table 1, not all the plant community classes are present in all study areas. In total, 140 quadrats of 1 m² were surveyed using a stratified random approach (ten quadrats per plant community at each site) [35]. The X, Y and Z coordinates of the plots were recorded within all quadrats using a Sokkia GSR2700 ISX differential global positioning system (dGPS). Points were recorded in the corners and center of all quadrats, five points per quadrat, as in Ward et al. [36]. This allowed us to record the location and elevation of the polygons for the training areas within each study site.

Image Acquisition
Two UAV flights were undertaken at each study site with an eBee Plus fixed-wing drone, one with a Parrot Sequoia 1.2-megapixel monochromatic multispectral sensor and the other with a senseFly S.O.D.A. The Parrot Sequoia camera records data in four bands (Green 550 nm @ 40 nm, Red 660 nm @ 40 nm, Red Edge 735 nm @ 10 nm and Near Infrared 790 nm @ 40 nm). The camera was calibrated prior to the flights using the Airinov calibration panel. In addition, this camera includes a sunshine sensor mounted on the top of the UAV to record sun irradiance for every shot, allowing reflectance values to be retrieved for the images. Flight heights were 120 m above mean sea level and the pixel resolution was 10 cm in all the images.
The flight using the photogrammetry camera senseFly S.O.D.A. reached a height of 123 m above sea level, recording RGB images at a resolution of 3.5 cm per pixel.
All flights were carried out during the summer period (Table 2), when inundation is minimal, so the presence of water does not substantially affect the reflectance values of the images [39] and the phenology provides optimal reflectance values [40]. RINEX observation and navigation files were used to carry out post-processed kinematic (PPK) correction of the multispectral images to improve their positional accuracy [41], using the ESTPOS Estonian GNSS-RTK permanent station network (Eesti Maa-amet) as a reference. A total of 7615 images were geographically positioned using the Estonian Coordinate System of 1997 (EPSG: 3301), and the PPK correction, computed with the eMotion 3® software, achieved an accuracy better than 7 cm. The corrected images were then processed in the Pix4D v.4.3.31® software to obtain five orthomosaics per study area [42].
The accuracy of the PPK corrections was assessed using Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) calculations. RMSE and MAE were used to estimate the differences between the ground control point (GCP) locations in the images and the independent GCP locations measured with the dGPS [43].
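The arithmetic behind these two error metrics is simple; the following is a minimal sketch in Python with NumPy (for illustration only; the study's processing used eMotion and dedicated GIS software), computing both metrics from paired GCP coordinates:

```python
import numpy as np

def rmse_mae(measured, reference):
    """Positional error metrics between image-derived GCP locations
    and independent dGPS measurements (rows = points, cols = X, Y[, Z])."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Euclidean distance between each measured/reference GCP pair
    d = np.linalg.norm(measured - reference, axis=1)
    rmse = np.sqrt(np.mean(d ** 2))  # penalizes large errors more strongly
    mae = np.mean(np.abs(d))         # average magnitude of the error
    return rmse, mae
```

Because RMSE squares the distances before averaging, it is always at least as large as MAE, so reporting both gives a sense of how much a few large positional errors dominate.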

Digital Elevation Models
The process known as Structure from Motion (SfM), described in Westoby et al. [13], is summarized in Figure 2. This process retrieves an accurate digital elevation model (DEM) that is able to represent the microtopographic elevation values that influence the distribution of plant communities [29]. The main objective of SfM is to find matching locations in the multiple images in order to extract the photogrammetric points. Pix4Dmapper software is used with the Multi-View Stereo photogrammetry module (SfM-MVS) to find common features within all the images [44], as seen in Step 1 (Figure 2). In Step 2, a bundle adjustment extracts a low-density point cloud from the common points using multi-view stereo photogrammetry, followed by its densification, in which an enhanced 3D point cloud is built from the different camera positions [44]. Finally, in Step 3, the point cloud is classified into two categories, ground and non-ground, using cloth simulation filtering. This technique is based on the interaction between a cloth model and the inverted point cloud to select the shape of that cloth model [45]. This last process is computed in the CloudCompare software. Only two parameters had to be set, the "rigidness" and the "ST", in order to control how the cloth model lies over the inverted point cloud. The final DEM is extracted by performing an interpolation from the filtered point cloud (Figure 2). After processing, the RMSE was between 5 and 18 cm against the dGPS measurements.
Each DEM was resampled to the pixel size of the spectral bands in each study area, using the Nearest Neighbor method.

Vegetation Indices
Calculating vegetation indices from the original bands allows the retrieval of vegetation properties in the images by sharpening the differences between the spectral values of plants while avoiding interference from soil and from vegetation components without a photosynthetic response [46]. Furthermore, vegetation indices have been shown to be reliable for monitoring and detecting types of vegetation due to their correlation with the biophysical properties of plants [47,48]. We calculated ten vegetation indices (Table 3), which were used as input images for the segmentation and classification processes, as they can improve the classification accuracy [46].
We performed band calculations to obtain the vegetation indices using the open source software R (v4.03) [49] with the packages raster [50] and rgdal [51]. (Table 3 gives the formulas of the indices, including the Green Normalized Difference Vegetation Index, the Red Edge Triangular Vegetation Index (core only), the Chlorophyll Index (red edge) [58], the Red Edge Normalized Difference Vegetation Index and the Modified Green Red Vegetation Index [61].) The indices in Table 3 have been shown to be sensitive to the chlorophyll content in different scenarios where plant and soil structures interact with the reflection of leaves, as well as differentiating plant species. Additionally, the red edge band is more sensitive to chlorophyll and the presence of water, and may discriminate the plant communities according to their wetness [62].
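The band calculations themselves are simple per-pixel arithmetic. As an illustration, three of the indices whose formulas are standard in the literature (NDVI, GNDVI and the RGB-only MGRVI) can be sketched in Python with NumPy; the actual processing in this study used the R raster package, and Table 3 contains the full set of ten indices:

```python
import numpy as np

def ndvi(nir, red):
    # Normalized Difference Vegetation Index
    return (nir - red) / (nir + red)

def gndvi(nir, green):
    # Green Normalized Difference Vegetation Index
    return (nir - green) / (nir + green)

def mgrvi(green, red):
    # Modified Green Red Vegetation Index (uses only RGB bands)
    return (green ** 2 - red ** 2) / (green ** 2 + red ** 2)

# Toy 2x2 reflectance bands standing in for Sequoia orthomosaic layers
nir = np.array([[0.6, 0.5], [0.4, 0.7]])
red = np.array([[0.1, 0.2], [0.3, 0.1]])
green = np.array([[0.2, 0.2], [0.2, 0.3]])

ndvi_layer = ndvi(nir, red)
mgrvi_layer = mgrvi(green, red)
```

Each function maps two reflectance rasters to a single index raster of the same shape, which is then stacked with the DEM as an explanatory variable for segmentation and classification.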

Classification of Images
We performed image classification using two different approaches: Pixel-Based Image Analysis (PBIA) and Object-Based Image Analysis (OBIA). The former has been used since the first Landsat satellite took the first remotely sensed images, when coarse spatial resolution images were classified and a single pixel was big enough to represent an object of interest. With the recent increase in the resolution of satellite images and the use of UAVs to record high-resolution images, single pixels may not always represent meaningful units, but a group of pixels treated as an object can do so. In OBIA, an image segmentation is performed to create homogeneous groups of pixels [63] that represent the objects for classification. In supervised classifications, an unknown sample (pixel or object) is assigned to a class based on an original training dataset, within an n-dimensional space (feature space) of explanatory variables. These variables are the bands extracted from spectral measurements, combinations of them or ancillary geographical data [64].
Figure 3 shows the workflow used to classify the images with the PBIA and OBIA approaches. For the OBIA approach, the vegetation indices were used, together with the DEM, as input for an image segmentation; a supervised classification was then performed using the training samples in raster format. For the PBIA approach, the vegetation indices and DEM were used to classify the image pixels using the training samples in vector format.

Segmentation
In the OBIA approach, image segmentation is the first step, creating the objects for classification. The segments are discrete units of spatial entities composed of homogeneous values in a multidimensional feature space [26]. To obtain segmentation parameters that yield an acceptable classification accuracy, previous studies have justified their choices based on trial-and-error approaches and visualizations [27]. Instead of following this process, we chose to use a Genetic Algorithm (GA) (Figure 3). This type of algorithm imitates the evolutionary process to optimize a fitness function, finding the optimal solution to a complex computational problem [65]. Genetic Algorithms have been shown to improve the classification accuracy for a large number of variables (bands) [66].
To carry out the segmentation of the images, we chose the SegOptim package written in R [67]. This package integrates a complete workflow for the segmentation and classification of images: it calls functions of third-party software to perform the image segmentation after tuning the segmentation parameters by applying a GA.
In this study, we chose the Large-Scale Mean-Shift algorithm [68] for the following reasons: it is executed in SegOptim by calling the Orfeo Toolbox library [69] (also open-source software); its tuning parameters consist of the spectral range and the spatial range of pixels in the feature space, plus the minimum size of the segments; and it performs a stable and efficient segmentation of images, as it splits the image into tiles. The spectral range indicates the search radius in multispectral feature space; the spatial range, the radius within which neighboring pixels are compared; and the minimum segment size is expressed in pixel units.
The input variables for the GA were the vegetation indices (Table 3), the DEM and the training samples. The training samples, originally in vector format, were rasterized, as SegOptim only accepts training samples as a grid of category values. The mean size of the training areas in raster format was 11 pixels. Each training sample is then assigned to a segment based on a threshold value. Due to the small size of the training areas, this threshold was set to 5%, meaning that segments covered by 5% or more of training pixels were considered valid cases for the classification, characterized by the mean and standard deviation of the values of the variables [55].
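The coverage-threshold assignment can be sketched as follows. This is an illustrative Python version of the rule just described (SegOptim implements it in R, and its exact internals may differ); segments whose training coverage reaches the threshold receive the modal training class:

```python
import numpy as np

def label_segments(segments, training, threshold=0.05, nodata=0):
    """Assign a training class to each segment when at least `threshold`
    of its pixels are covered by rasterized training samples.
    `segments` holds segment ids; `training` holds class codes (0 = none).
    Returns a dict {segment_id: class}."""
    labels = {}
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        classes = training[mask]
        covered = classes != nodata
        if covered.mean() >= threshold:
            # modal (most frequent) training class within the segment
            vals, counts = np.unique(classes[covered], return_counts=True)
            labels[seg_id] = int(vals[np.argmax(counts)])
    return labels

segments = np.array([[1, 1, 2, 2],
                     [1, 1, 2, 2]])
training = np.array([[3, 0, 0, 0],
                     [0, 0, 0, 0]])
seg_labels = label_segments(segments, training)
```

In this toy case, segment 1 is 25% covered by class-3 training pixels and is kept as a valid training case, while segment 2 has no training coverage and is left unlabeled.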
Finally, the fitness value in SegOptim retrieved the best parameters to apply to the segmentation in each study area for the subsequent classification using RF or KNN. The result was a segmented image in raster format (Figure 3).
In order to see whether there were differences between the training areas as segments (the valid cases for training the classifiers) and as polygon features, we compared the values of each variable (the input vegetation indices and the DEM). Because the values did not follow a normal distribution, we carried out a non-parametric Kruskal-Wallis test between the medians of the training samples and of the segments carrying the training-sample characteristics, for each plant community.

ML Classifiers
To perform the PBIA and OBIA classifications, we chose the RF and KNN classifiers (Figure 3). RF is a non-parametric ensemble classifier (multiple classifications on the same data) that grows multiple decision trees from subsets of training samples drawn with replacement (bagging) [70]. The group of trees makes a final decision based on the most repeated vote (modal category) and assigns that category to an unknown sample. While the forest is being built, the samples not used for training form the out-of-bag (OOB) fraction, which estimates the internal performance with an error rate [70]. The OOB fraction is also used in the calculation of variable importance, by randomly permuting the values of one input variable in the out-of-bag fraction. If the OOB error increases, the variable was important in reducing the error, and thus, it is considered more important in the tree [71].
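Bagging, the OOB score and impurity-based importance can be illustrated with scikit-learn's `RandomForestClassifier` on synthetic data (the study itself used R's caret and SegOptim packages; the two "spectral" clusters below are invented stand-ins for plant community samples):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
# Two well-separated synthetic "spectral" clusters, 30 samples each
X = np.vstack([rng.normal(0.0, 0.1, (30, 2)),
               rng.normal(1.0, 0.1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

# oob_score=True evaluates each tree on the samples it did not see
rf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
rf.fit(X, y)

oob_accuracy = rf.oob_score_           # internal performance estimate
importances = rf.feature_importances_  # impurity-based (Gini) importance
```

The OOB accuracy is obtained without a separate validation set, which is one reason RF remains usable with the small training datasets typical of field surveys like this one.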
The KNN algorithm is a non-parametric classifier that assigns a class label to an unknown sample based on its statistical distance to the training samples. The Euclidean distance is used by default in the caret package when applying the KNN method. Once the distances are calculated, the algorithm selects the K nearest neighbors and assigns the modal category of those k training samples. We searched for the best accuracy for k between 1 and 20. These minimum and maximum k values were initially set to 1 and 26, as 26 resulted from averaging, across all the study areas, the square root of n (n = number of pixels in the training samples) [72]. However, in all cases, the accuracy decreased after k = 20.
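The search over k can be sketched as a simple cross-validated loop; this is an illustrative scikit-learn version of what caret's tuning does (Euclidean distance by default, synthetic data standing in for the training pixels):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Two synthetic, well-separated classes of 40 samples each
X = np.vstack([rng.normal(0.0, 0.2, (40, 2)),
               rng.normal(1.0, 0.2, (40, 2))])
y = np.array([0] * 40 + [1] * 40)

# Evaluate k = 1..20 with 10-fold cross-validation (Euclidean distance)
scores = {}
for k in range(1, 21):
    knn = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(knn, X, y, cv=10).mean()

best_k = max(scores, key=scores.get)  # k with the highest mean accuracy
```

The best k is then fixed and the classifier applied to the full image, exactly as the chosen k per study area was used here.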
For the PBIA classification, we used the caret package [73], which can perform an RF or KNN classification by changing the arguments of the required functions (method = "rf" or "knn", respectively). This function requires the input variables in grid format, stacked in a multiband raster, and uses the training areas in vector format (polygons).
For the OBIA classification, the output segmentation obtained in the previous step was used to classify the objects. Prior to classification, a calibration phase was undertaken, in which the category values of the training samples are assigned to the segments containing those pixels, based on a threshold value. This classification, as mentioned before, was carried out using the SegOptim package.
We carried out four classifications in each study area: RF and KNN based on PBIA and RF and KNN based on OBIA.

Classification Accuracy and Variable Importance
We consider the accuracy to be the agreement between a previously unknown classified value and the training samples, accepted as true values [74]. The classifications were evaluated using the user accuracy, producer accuracy, overall accuracy and kappa statistic, with a 10-fold cross-validation to construct the confusion matrices for each classification. For the RF classifications based on PBIA and OBIA, we extracted the Mean Decrease in Gini (MDG) to report the variable importance in each study area. The MDG is based on the Gini impurity criterion, which measures the chance of misclassifying a new sample: replacing the purest variable in a node with another one increases the chance of classifying the new sample incorrectly, meaning that the previous variable was more important (purer) for splitting the node [25]. The higher the MDG, the more important the variable.
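All four accuracy measures derive from the confusion matrix. A minimal Python sketch (the study computed these in R via caret) using the usual remote sensing convention, with rows as the mapped classes and columns as the reference classes:

```python
import numpy as np

def accuracy_metrics(cm):
    """cm[i, j] = samples mapped as class i whose reference class is j."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    diag = np.diag(cm)
    overall = diag.sum() / n
    user = diag / cm.sum(axis=1)      # per mapped class (commission error)
    producer = diag / cm.sum(axis=0)  # per reference class (omission error)
    # Kappa: agreement corrected for the agreement expected by chance
    pe = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n ** 2
    kappa = (overall - pe) / (1 - pe)
    return overall, user, producer, kappa

cm = np.array([[45, 5],
               [5, 45]])
overall, user, producer, kappa = accuracy_metrics(cm)
```

Note that kappa is lower than overall accuracy whenever some agreement would occur by chance, which is why both are reported here.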

Map Comparisons
A common task in remote sensing is to perform a comparison between classifications in order to assess the differences in thematic maps [75]. Although parameters derived from the kappa statistic (Khisto, Klocation and Kquantity) are widely used to compare categorical maps [76,77], they are not sufficient to compare accuracies, as they have a randomness component [78]. To overcome these limitations, a newer and simpler measure of overall agreement is used within this study. It is composed of two metrics: quantity disagreement and allocation disagreement. The first component refers to the proportion of differences due to unmatched quantities of categories, and the second to the differences due to the displacement of categories within the area [78]. These comparisons are performed between maps with the same pixel size.
The ML algorithm with the highest classification accuracy was chosen as the reference map. Comparisons were performed between the PBIA and OBIA results of the best classifier. We report the percentage of agreement and disagreement areas between maps, and the overall percentages of quantity and allocation agreement and disagreement relative to each study area, using the Rsenal package [79] in R [49]. The variation in the quantity of each category (plant community) refers to the differences in their composition throughout the area, and the variation in allocation to the differences in their spatial configuration [80]. This allowed the interpretation, in a spatial context, of the differences among the classifications. We also calculated the changes of categories within the disagreement areas.
The remaining percentage (where the sum of quantity and allocation agreement and disagreement does not reach 100%) corresponds to the proportion of agreement that is expected to occur by chance (chance agreement).
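The quantity/allocation decomposition is simple arithmetic on a cross-tabulation of the two maps. The study computed it with the Rsenal package in R; the following Python sketch (an illustration, not the Rsenal implementation) follows the decomposition of reference [78]:

```python
import numpy as np

def quantity_allocation_disagreement(cm):
    """Decompose total disagreement between two maps.
    cm[i, j] = pixels mapped as class i in map 1 and class j in map 2.
    Returns (quantity, allocation) as proportions of the map area."""
    p = np.asarray(cm, dtype=float)
    p = p / p.sum()  # convert counts to proportions
    total_disagreement = 1.0 - np.trace(p)
    # Quantity: mismatch in the overall amount of each class
    quantity = np.abs(p.sum(axis=1) - p.sum(axis=0)).sum() / 2.0
    # Allocation: mismatch in where the classes are placed
    allocation = total_disagreement - quantity
    return quantity, allocation

# Example: map 1 has 10 extra pixels of class 0 relative to map 2,
# with no swapped (reallocated) pixels
cm = np.array([[40, 10],
               [0, 50]])
q, a = quantity_allocation_disagreement(cm)
```

In this toy cross-tabulation, the whole 10% disagreement is quantity disagreement and none is allocation, mirroring the study's interpretation that quantity reflects composition and allocation reflects spatial configuration.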

Segmentation and Comparison between Training Areas
The GA retrieved the best solution between the given maximum and minimum values for each classifier, as shown in Table 4. This creates segments according to a search radius based on these values. On average, the size of the training samples is 8 pixels per quadrat in RUE, KUD and RAL, 9 in MA2 and 5 in TAS and TAN, although the OP category has larger training areas in TAS, TAN and RUE, of 21, 40 and 42 pixels, respectively. The Kruskal-Wallis test did not show any differences between these segments and the initial training polygons.

ML Accuracy Assessment
Figure 4 shows the kappa indices and accuracy scores of the classifications obtained in each study area. The kappa statistic of all the classifiers was greater than 0.7, far from a random classification, and all accuracies reached at least 80%, except for the KNN classifier in TAS using OBIA (Figure 4). The classified maps are shown in Figure A1. The PBIA approach generally performed better than OBIA, and the RF algorithm performed better than the KNN algorithm. KUD and MA2 are the only two sites where the OBIA approach performs better than PBIA using the RF algorithm, and KUD is the only one where it does so using the KNN algorithm (Figure 4). According to the MDG, the study areas of KUD and MA2 have only one important variable in common (Datt4 and DEM, respectively) in building the RF classification model (Figure 5). The eight most important variables in the RF classification using PBIA and OBIA are DEM, CVI, MGRVI, Datt4, CCCI, NDVI, GNDVI and RTVIcore (Figure 5). The number of nearest neighbors (k) is higher in OBIA (Figure 5), probably due to the more homogeneous characteristics of the segments, which is why the classifier chose a larger distance for the training samples than in PBIA.

Map Comparisons
As RF was the most accurate classifier, we used it as the reference for the comparisons.
The highest agreement between classifications is between the RF OBIA and PBIA classifications, where the areas of agreement cover more than 65% of the total area (Figure 6). This means that RF has classified the images in such a way that pixels follow patterns similar to the segments. In the classified maps of RF using OBIA and PBIA, both maps have a similar spatial configuration (Figure A1). Nevertheless, the quantity of categories differs, mainly due to the change from category OP to US, which represents 27% of the area of disagreement. By contrast, the site TAS has the highest allocation disagreement relative to the quantity disagreement (Figure 6). The categories most mixed within the disagreement areas were: KUD (38% from US to TG), RAL (45% from TG to US), RUE (27% from OP to US) and TAN (42% from LS to US and 37% from US to LS). In TAS and MA2, the mixture involves more than two categories (Figure 6).
The RF and KNN PBIA classifications exhibited the greatest extent of disagreement areas. In this instance, the main changes were between the categories TG and US, except in TAN, where the changes were between LS and US. Allocation disagreement was generally higher than quantity disagreement between these two classifiers. The agreement between the RF and KNN OBIA classifications is higher than between their PBIA counterparts, possibly because pixels and segments share the same area in both classifications (Figure 6).

Discussion
The use of high-resolution images to classify plant communities in wetlands has been previously undertaken in several studies using either commercial satellites (<4 m resolution) [81][82][83] or UAV images (<0.1 m resolution) for environmental assessment [10,84]. The classification of plant communities in coastal wetlands, where individual plants are less than 1 m wide (e.g., coastal grasslands, floodplain grasslands, saltmarshes and seagrasses), benefits from the use of very high resolution data to reveal as much spectral variability as possible [38,42].
The use of ML algorithms in remote sensing has provided a solution for complex classifications, using additional explanatory variables apart from the spectral values, such as vegetation indices or DEMs, to build classification models [22,42,84]. These algorithms can also classify categorical units in a more complex feature space, where the spectral separability is low due to the higher variability within classes [85].
Classification using the RF algorithm was better than with KNN, both in PBIA and OBIA, in all the study areas except RAL, where KNN using PBIA showed a higher accuracy and kappa index (Figure 4). These results show that ML algorithms are appropriate for these types of images, but that RF produces more robust results than KNN for coastal wetland plant communities. The reason is that RF builds a classification model using different response variables and extracts the variable importance based on the OOB error from a bagging method to assess the main contribution of each variable [22,71,86,87]. The KNN classifier, while a robust algorithm [24,88], does not perform as well in complex feature spaces when calculating the nearest neighbors [22,89]. This explains the larger areas of disagreement when the KNN and RF classifications are compared (Figure 6). The distribution areas output by the RF classifier match the complex mosaic of plant communities observed on-site, resulting from transitions from one plant community to another, disturbances or restoration activities (Figure A2). The comparison between OBIA and PBIA in RF shows that disagreement areas are also located in heterogeneous locations, where segments do not represent the heterogeneity of the plant communities. The disagreement areas are due to a different spatial configuration of the categorized plant communities and to the "salt and pepper" effect, as described in high-resolution image classification [27].
Regarding the variables used, the MDG results from the RF classifications show that the most frequently important variables in both approaches are the DEM, MGRVI and CVI (Figure 5). As shown in the literature, microtopography significantly influences the distribution of plant communities in coastal wetlands [2,29]. On the other hand, the two vegetation indices have been shown to have a high discrimination potential in the RGB spectrum (MGRVI [61]) and in the NIR spectrum (CVI [54]), related to leaf chlorophyll content.
Selecting the most suitable segmentation for an OBIA classification can be a complex task due to the low spectral contrast between plant species in coastal wetlands [90]. In this study, we tried to avoid this problem by using a GA (Figure 3), as it selects the parameters that provide the most accurate classification (Table 4). In addition, we used the DEM and vegetation indices instead of the spectral bands for the segmentations and classifications, as they enhance the differences between plant communities [42,46]. The segments required a low variability of the input bands as explanatory variables (the "spectral range" parameter) and size parameters able to extract meaningful units similar to the training samples. Therefore, the segments used for training do not differ significantly from the original training samples. This was because we plotted small samples in the field in order to obtain the purest training samples. Nevertheless, this was suitable for obtaining high-accuracy classifications with an OBIA approach.
PBIA performed better than the OBIA classification in most of the study areas, except for KUD and MA2 with the RF classifier and KUD with the KNN classifier (Figure 4). In these areas, there are greater cattle grazing pressures (though still at a low intensity) than at the other sites [37], which may create larger patches of homogeneous plant communities in terms of the reflectance response. In this instance, the segments would have grouped pixels that are more homogeneous, in part explaining the marginally better performance of the OBIA classification at this site. Nevertheless, this is not the case for MA2. This area has a homogeneous distribution of plant communities, clearly influenced by the limited variation in elevation. Unlike in KUD, the objects in MA2 mix categories together. The results in panel g of Figure 5 show the use of a greater number of neighbors in the KNN classifier with the OBIA approach than with the PBIA approach, due to the need for larger statistical distances between categories to classify the segments.
There are disagreements between the classifiers due to similarities among the plant communities, mainly LS, TG and US (Figure A3). These plant community categories belong to a broader grassland classification [38], in which the spectral values may be similar in spite of the influence of the microtopography (DEM). This study does not investigate the physiological reasons for these differences; we suggest further research using different band combinations to derive other vegetation indices and to test whether this also occurs in ecotones of other ecosystems.
Most studies suggest that an OBIA approach extracts meaningful spectral information from high-resolution images by reducing the "salt and pepper" effect after classification, while also considering the spectral and spatial information of features once they are grouped into objects by a segmentation [27,63]. Nevertheless, classification accuracies with an OBIA approach depend on the spectral properties of the objects and on their sizes relative to the research goals [81]. This study shows better results with pixel-based classification (PBIA). A previous work chose PBIA rather than OBIA for classifying high-resolution UAV images of coastal wetlands, because OBIA can reduce the spectral variability in the images [42]. Further work could test whether classification accuracies decrease as the sizes of the training areas, and thus of the segments, increase, since larger segments homogenize larger areas.
It is important to note that the uncertainties generated during image processing are not addressed in this paper. However, it should be appreciated that the plant communities studied describe a gradient of change rather than the discrete units seen in the classifications (Figure A1); this is one important source of uncertainty. In addition, the extraction and resampling of the DEM to match the original spectral bands, the calculation of the indices and the algorithms themselves are also sources of uncertainty that may affect the reliability of our classifications [91].
Using ML classifiers, we obtained accurate results when categorizing the plant communities of heterogeneous areas across different study sites. Therefore, we recommend their use for characterizing vegetation structure not only in coastal wetlands but also in similar ecosystems with mosaic patches of plant communities composed of small plants, such as grasses and forbs, in tidal grasslands and floodplains. PBIA classifications are better in situations where the vegetation cannot be grouped into meaningful objects. Further work is needed to confirm that OBIA accuracy really decreases as the sizes of the training areas, and thus of the segments, increase in coastal wetlands and in ecosystems with similarly heterogeneous vegetation.
According to our results, RF is a robust and highly accurate classifier that also works well with small training samples. RF additionally provides variable importance scores, which indicate which variables contribute most to the classification. This will help future research to study changes in plant community composition over time using the variables that are most important for categorizing vegetation. The possibility of carrying out near-real-time monitoring routines with UAVs allows images to be acquired over different periods and classified with a robust classifier such as Random Forest to produce thematic maps [42].
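The variable importance scores mentioned above can be read directly off a fitted forest. The sketch below uses scikit-learn with synthetic data; the feature names are hypothetical stand-ins for the indices of Table 3, and scikit-learn's `feature_importances_` is the mean decrease in impurity (Gini), the same idea as the MDG scores reported in Figure 5:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names standing in for the study's indices + DEM.
names = ["NDVI", "GNDVI", "NDRE", "SAVI", "OSAVI", "DEM"]
X, y = make_classification(n_samples=400, n_features=len(names),
                           n_informative=4, n_classes=5,
                           n_clusters_per_class=1, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank variables by mean decrease in Gini impurity, highest first.
ranked = sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1])
```

Monitoring work can then restrict future acquisitions or classifications to the top-ranked variables.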
Performing a rapid mapping assessment of vegetation is essential for quick decision making in environmental management and conservation. A rapid methodology for obtaining high-resolution images and classifying them with machine learning algorithms achieves high accuracies, which can be used to monitor environmental change over time and to compare different sites at the same time, provided the parameters for classification and image acquisition are kept constant. Further studies could use more training samples and categories to test whether the accuracy of wetland classifications decreases due to a more complex feature space of pixel values in a pixel-based approach. In addition, the spatial variability of categories at the present scale of study could be compared with smaller scales using satellite-mounted sensors to upscale the images.

Conclusions
The use of ML algorithms is valuable for classifying high-resolution images when the composition of the study areas is complex. In this study, we have shown that RF and KNN classifications are accurate and robust when using vegetation indices and digital elevation models, but RF retrieved better results when classifying plant communities in coastal wetlands. In spite of the high accuracies of both PBIA and OBIA classifications, our results show that object-based classifications perform slightly worse than pixel-based approaches, because these ecosystems exhibit high variability in high-resolution images and grouping pixels masks some of that variability. PBIA is more suitable for classifying high-resolution images in coastal wetlands when the goal is to show the variability of the study area. OBIA could be useful as a post-processing step, for example to generalize plant community patterns, or when larger training samples are available, allowing a segmentation with higher thresholds.
As shown in Figure 6, RF yields lower disagreement scores between the pixel and object classifications. Nevertheless, this depends on the study area and on the similarities between plant communities. The agreement between the RF classifications of objects and pixels could be improved by using images acquired on other dates or by using other vegetation indices to discriminate plant characteristics other than those used in this study.
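The quantity and allocation disagreement scores of Figure 6 follow the standard decomposition of total disagreement from a cross-tabulation of two maps (Pontius and Millones, 2011). A minimal sketch, assuming one classification in the rows and the other in the columns of the matrix:

```python
import numpy as np

def pontius_disagreement(cm):
    """Quantity and allocation disagreement from a cross-tabulation matrix
    (one classification in rows, the other in columns)."""
    p = cm / cm.sum()                    # cell proportions
    total = 1.0 - np.trace(p)            # overall disagreement
    # Quantity disagreement: mismatch in category proportions only.
    quantity = 0.5 * np.abs(p.sum(axis=1) - p.sum(axis=0)).sum()
    # Allocation disagreement: the remaining, spatially misplaced part.
    allocation = total - quantity
    return quantity, allocation

# Toy two-category example: 100 pixels cross-tabulated between two maps.
cm = np.array([[40.0, 10.0],
               [5.0, 45.0]])
q, a = pontius_disagreement(cm)  # q = 0.05, a = 0.10
```

The corresponding agreement shares plotted in Figure 6 are then simply the complements of these two components.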

Figure 2. Workflow for the structure from motion technique to generate high-resolution digital elevation models from photogrammetric images. In each step, the software used in this study is indicated.

Figure 3. Workflow to classify the images with OBIA and PBIA. The original spectral bands were used to calculate the vegetation indices. For an OBIA approach, vegetation indices were used as

Figure 5. MDG scores in each study area. On the left, RF PBIA; on the right, RF OBIA. Study areas: (a) Kudani, (b) Ralby, (c) Rumpo East, (d) Tahu South, (e) Tahu North and (f) Matsalu02. The number of nearest neighbors used by the KNN classifier in PBIA and OBIA is shown in panel (g).

Figure 6. Comparisons between classifiers using the parameters of Quantity Disagreement, Allocation Disagreement, Allocation Agreement and Quantity Agreement (left) and agreement and disagreement areas (right). (a) RF OBIA and PBIA, (b) RF and KNN PBIA, (c) RF and KNN OBIA.

Table 1. Plant communities sampled in each study area, elevation range and area. LS: Lower Shore; OP: Open Pioneer; RS: Reed Swamp; TG: Tall Grassland; US: Upper Shore.

Table 2. Flight dates in each study area.

Table 3. List of the ten vegetation indices calculated to improve the classification accuracy in this study. G: green band; R: red band; Rre: red edge band; NIR: near-infrared band.

Table 4. Best parameters for the segmentations after applying the GA for each classifier (Best RF and Best KNN) and mean size of segments (in pixels) for each classifier in the study areas. The parameters are Spectral Range (red, reflectance units), Spatial Range (brown, in meters) and Minimum Size (green, pixels).