Machine Learning Vegetation Filtering of Coastal Cliff and Bluff Point Clouds

Abstract: Coastal cliffs erode in response to short- and long-term environmental changes, but predicting these changes continues to be a challenge. In addition to a chronic lack of data on the cliff face, vegetation presence and growth can bias our erosion measurements and limit our ability to detect geomorphic erosion by obscuring the cliff face. This paper builds on past research segmenting vegetation in three-band red, green, blue (RGB) imagery and presents two approaches to segmenting and filtering vegetation from the bare cliff face in dense point clouds constructed from RGB images and structure-from-motion (SfM) software. Vegetation indices were computed from previously published research and their utility in segmenting vegetation from bare cliff face was compared against machine learning (ML) models for point cloud segmentation. Results demonstrate that, while existing vegetation indices and ML models are both capable of segmenting vegetation and bare cliff face sediments, ML models can be more efficient and robust across different growing seasons. ML model accuracy quickly reached an asymptote with only two layers and RGB images only (i.e., no vegetation indices), suggesting that these more parsimonious models may be more robust to a range of environmental conditions than existing vegetation indices, which vary substantially from one growing season to another with changes in vegetation phenology.


Introduction
Coastal cliffs and bluffs erode in response to a complex interaction of abiotic, biotic, and anthropogenic factors across a range of space and time scales. Our ability to accurately predict future change in coastal cliffs is, at least partially, predicated on our ability to understand how the cliff has changed in the past in response to these natural and anthropogenic factors [1]. Coastal change along low-relief barrier islands, beaches, and some coastal dunes can be well represented through historical and modern imagery [2][3][4][5], LIDAR surveys [6][7][8][9], and near-nadir (i.e., down-looking toward the Earth surface) structure from motion with multiview stereo (SfM) [5,[8][9][10][11].
In contrast, high-relief coasts remain a data-poor environment and are only partially represented by near-nadir data sources and surveys. Because imagery and LIDAR are frequently collected at or near nadir, they either partially or completely fail to represent vertical or near-vertical surfaces, such as coastal cliffs and bluffs, accurately [12,13]. Terrestrial LIDAR scanning (TLS) and oblique photogrammetry (aircraft, UAV, or land/water-based) do provide the ability to remotely capture data on vertical or near-vertical cliff faces and may be useful monitoring tools for cliff face erosion and talus deposition [13][14][15][16][17][18][19]. Measuring geomorphic change along the cliff face is challenging where vegetation is present.
Vegetation growing on the cliff can obscure the cliff surface (Figure 1) and generally limit our ability to monitor localized erosion patterns that can contribute to large-scale slope destabilization [15]. Localized patterns can precede larger cliff failures [20], highlighting the importance of closely monitoring the cliff face for geomorphic changes without interference from any vegetation that may be present. Although tree trunks have been used to track landslide erosion [21], cliffs present a unique challenge as they may either lack distinct objects that can be tracked and used to measure change, or such objects may rest in a relatively stable position at the cliff base (Figure 1). In addition to obscuring our view of the cliff face, vegetation may also affect coastal cliff stability [22], and it may be important to identify where vegetation is present and where it is absent. Given the importance of monitoring coastal cliff erosion in the presence of vegetation, it is important that we can efficiently and accurately segment bare-Earth points on the cliff face from vegetation so we can monitor the cliff face for potential leading indicators of slope instability, such as groundwater seepage faces, basal notching, and rock fractures.
An added challenge with coastal cliff monitoring is our digital representation of high-relief environments. Digital elevation models (DEMs) or digital surface models (DSMs) are often used to characterize landscape morphology and change over time. However, these 2D representations of the landscape oversimplify vertical and near-vertical slopes and may completely fail to represent any oversteepened or overhanging slopes where the cliff face or edge extends farther seaward than the rest of the cliff face. Furthermore, previous research demonstrates that vertical uncertainty with 2D landscape rasters increases as the landscape slope increases [12]. Triangular irregular networks (TINs) represent a more nuanced approach to representing cliffs, although their analysis becomes significantly more challenging because each triangular face has an irregular shape and size, and they tend to be much larger files. Because coastal cliffs are very steep and may be overhanging, a more realistic approach to representing and analyzing LIDAR-, TLS-, or SfM-derived cliff data is to deal directly with the colorized point cloud. Since point clouds often only consist of red, green, and blue (RGB) reflectance bands and do not contain a near-infrared (NIR) band, calculating vegetation indices like the normalized difference vegetation index (NDVI) is not possible. As a result, segmenting vegetation using NDVI or other indices with NIR bands is also often not possible with point cloud data. In addition, utilizing point clouds with only RGB bands enables us to leverage older point clouds from a variety of sources lacking additional bands. Instead of relying on an NIR or other additional band, a more robust approach that only utilizes RGB bands would be more valuable for a range of studies and environments.
Vegetation characterization and the filtering of point clouds can be accomplished using software, such as Agisoft Metashape 1.8.5 Professional [23] and CANUPO [24], or a combination thereof [25]. Agisoft Metashape Professional can display and classify/reclassify dense point clouds as ground, vegetation, and a range of other natural and built classes; however, it struggles with vertical and overhanging surfaces, such as cliffs (Figure 2). CANUPO is available as standalone software or as an extension to CloudCompare [26] that can also reclassify point clouds into multiple classes and does have the ability to function with vertical and overhanging surfaces. It uses a probabilistic classifier across multiple scales in a "plane of maximal separability", although the resulting point cloud is prone to false positives for vegetation points, resulting in a speckling of noise in the reclassified point cloud (Figure 2). Other approaches, such as [25], operate using geometric and spectral signatures and a random forest (RF) machine learning (ML) model, and may be more accurate but are not easily accessible and require more powerful computing resources. While RF models may outperform other commercially available software and are more easily interpretable, they can quickly grow overly complex and large, requiring more computing resources. As such, this paper explores multi-layer perceptron (MLP) ML models as robust alternatives to existing classifiers because MLP models are simpler and faster to develop and can be more efficient than existing point cloud classifiers.
This paper builds on previous research segmenting imagery [27,28] and point clouds [23][24][25][29][30][31] to test whether MLP models can be leveraged to segment dense coastal cliff point clouds into vegetation and bare-Earth points. Multiple model architectures and model inputs were tested to determine the most efficient, parsimonious, and robust ML models for vegetation segmentation. Multiple MLP models using only RGB values were compared against MLP models using RGB values plus one or more vegetation indices from the previous literature. In addition, this paper tested whether the addition of 3D standard deviation (3D StDev) can improve MLP models for vegetation segmentation in point clouds. The feasibility of the MLP models was demonstrated using SfM-derived point clouds of the Elwha Bluffs, located along the Strait of Juan de Fuca in Washington State, USA. The most accurate, efficient, and robust models are compared to other point cloud classifiers and the efficacy of this approach is highlighted using the above case study.

Vegetation Classification and Indices
Vegetation indices computed from RGB bands have been extensively used to segment vegetation from non-vegetation in imagery [31][32][33][34][35][36][37][38][39][40][41][42][43][44], and their development was often associated with commercial agriculture applications for evaluating crop or soil health when NIR or other bands were not available. However, these indices can be adapted by a broader remote-sensing community for more diverse applications, such as segmenting vegetation from bare-Earth points in dense point clouds. Some vegetation indices, like excess red (ExR; [36]), excess green (ExG; [35,40]), excess blue (ExB; [39]), and excess green minus red (ExGR; [37]), are relatively simple and efficient to compute, while others are more complex and take longer to calculate (NGRDI: [45]; MGRVI: [46]; GLI: [47]; RGBVI: [46]; IKAW: [33]; GLA: [48]). Regardless of the complexity of the algorithm, it is possible to adapt such indices to segment vegetation from bare-Earth points in any number of ways. Figure 3 illustrates how a single image can be transformed by different vegetation indices and binarized to segment vegetation from non-vegetation pixels. To differentiate vegetation from bare-Earth pixels or points, a decision must be made about the threshold value used to separate the two classes. This thresholding can be done several ways, the simplest being a user-defined manual threshold and a more statistically robust approach being Otsu's thresholding method [48]. Manual thresholding is a brute-force approach that is simpler to implement but may vary greatly by application, user, and the index or indices being thresholded. In this approach, the user specifies a single value that serves as a break point between the two classes: points above this value belong to one class and points below it belong to the other. Although manual thresholding may yield good results for some applications and users, the selection of a specific threshold value is highly subjective and may not be transferrable across other geographies or datasets.
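A minimal sketch of computing the four simple excess indices and applying a manual threshold is shown below; the formulas follow the commonly published chromatic-coordinate forms, and the example threshold value (0.1) is purely illustrative, not one used in this paper.

```python
import numpy as np

def excess_indices(rgb):
    """Compute simple excess indices from an (N, 3) array of 8-bit RGB values.

    Formulas follow commonly published chromatic-coordinate forms; see the
    cited sources for the exact variants used in this paper.
    """
    rgb = rgb.astype(float)
    total = rgb.sum(axis=1, keepdims=True)
    total[total == 0] = 1.0          # avoid division by zero on black points
    r, g, b = (rgb / total).T        # chromatic coordinates in [0, 1]
    exg = 2.0 * g - r - b            # excess green
    exr = 1.4 * r - g                # excess red
    exb = 1.4 * b - g                # excess blue
    exgr = exg - exr                 # excess green minus red
    return exg, exr, exb, exgr

# Manual thresholding: points with ExG above a user-chosen value are vegetation.
points = np.array([[34, 120, 30], [140, 135, 128]])  # toy green vs. gray points
exg, exr, exb, exgr = excess_indices(points)
is_veg = exg > 0.1
```

Because the break point is fixed by the user, the same 0.1 cutoff that separates these toy points cleanly may misclassify points under different lighting or phenology, which is the subjectivity problem described above.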
Otsu's thresholding method [48] is a more statistically robust and less subjective approach than manual thresholding. Ref. [49] demonstrated that Otsu's thresholding was an effective approach to binarize a given index into two classes. This method assumes the index being binarized is bimodally distributed, where each mode represents a single class. If binarizing an index into vegetation and bare-Earth classes, one mode would represent vegetation pixels/points and the other mode would represent bare-Earth pixels/points. Given this assumption, a threshold between the two modes can be determined as the value that maximizes the between-class variance (i.e., the separation between the two mode peaks). Otsu's thresholding is more likely to be transferrable to other datasets and/or geographies, depending on how representative the vegetation index values are of other datasets.
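Under the bimodality assumption, Otsu's method can be sketched in a few lines. This minimal NumPy version (not the implementation used in the paper) scans histogram bins for the threshold that maximizes the between-class variance.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the threshold maximizing between-class variance (Otsu's method)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    p = hist / hist.sum()              # probability mass per bin
    w0 = np.cumsum(p)                  # weight of the lower class
    w1 = 1.0 - w0                      # weight of the upper class
    mu = np.cumsum(p * centers)        # cumulative mean
    mu_total = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        # between-class variance: w0*w1*(mu0 - mu1)^2, rewritten in cumulative form
        between = (mu_total * w0 - mu) ** 2 / (w0 * w1)
    between[~np.isfinite(between)] = 0.0
    return centers[np.argmax(between)]

# Bimodal toy index values: the threshold should fall between the two modes.
rng = np.random.default_rng(0)
index = np.concatenate([rng.normal(-0.5, 0.1, 1000), rng.normal(0.5, 0.1, 1000)])
t = otsu_threshold(index)
```

For a strongly bimodal index, the recovered threshold sits in the valley between the two modes, which is why the method generalizes better than a hand-picked cutoff.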

All indices used in this paper (Table 1) were well constrained, with defined upper and lower limits ranging from −1 to 1.4 (ExR and ExB), −1 to 2 (ExG), −2.4 to 3 (ExGR), or −1 to 1 (NGRDI, MGRVI, GLI, RGBVI, IKAW, and GLA). To overcome the vanishing or exploding gradient issue with ML models and to ensure maximum transferability to additional locations and datasets, only vegetation indices with well-constrained ranges were utilized here. The mathematics and development of each vegetation index are beyond the scope of this paper but may be found in the references herein. Furthermore, to address the issue of vanishing gradients due to different input ranges, all ML model inputs, including RGB and all vegetation indices, were normalized individually to range from 0 to 1 using Equation (1):

x_norm = (x_i − x_min) / (x_max − x_min)    (1)

where x_i is a value for one point, x_min is the minimum value for all points, and x_max is the maximum value for all points. For RGB values, the minimum and maximum values were 0 and 255, respectively, corresponding to the 8-bit color depth range. For each vegetation index, the minimum and maximum values correspond to the value range in Table 1. For 3D StDev, the minimum was set to 0 and the maximum was set to the maximum calculated 3D StDev value.
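Equation (1) is ordinary min-max normalization and can be applied per input as below; the bounds shown for ExG are those listed in Table 1.

```python
import numpy as np

def minmax_normalize(x, x_min, x_max):
    """Normalize values to [0, 1] following Equation (1)."""
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)

# 8-bit RGB uses fixed bounds of 0 and 255; each vegetation index uses the
# bounds from Table 1 (e.g., ExG ranges from -1 to 2).
red_norm = minmax_normalize([0, 128, 255], 0, 255)
exg_norm = minmax_normalize([-1.0, 0.5, 2.0], -1.0, 2.0)
```

Using the fixed theoretical bounds, rather than the bounds of any one point cloud, keeps the normalization identical across datasets, which supports the transferability goal stated above.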
Table 1. Vegetation indices with well-constrained upper and lower value bounds explored in this research, where R, G, and B are the spectral values for the red, green, and blue channels, respectively.

Vegetation Index | Value Range (Lower, Upper) | Source
Excess Red (ExR) | (−1, 1.4) | [36]
Excess Green (ExG) | (−1, 2) | [35,40]
Excess Blue (ExB) | (−1, 1.4) | [39]
Excess Green minus Red (ExGR) | (−2.4, 3) | [37]
Normalized Green Red Difference Index (NGRDI) | (−1, 1) | [45]
Modified Green Red Vegetation Index (MGRVI) | (−1, 1) | [46]
Green Leaf Index (GLI) | (−1, 1) | [47]
Red Green Blue Vegetation Index (RGBVI) | (−1, 1) | [46]
Kawashima Index (IKAW) | (−1, 1) | [33]
Green Leaf Algorithm (GLA) | (−1, 1) | [48]

ML Models

Deriving 3D Standard Deviation
In addition to the RGB and vegetation indices, this paper tested whether the addition of 3D StDev improved vegetation segmentation performance and/or accuracy. The 3D StDev was computed for every point in the dense point cloud iteratively, similar to the approach by [50]. Points were first spatially queried using a cKDTree available within the scipy.spatial package [51], and then the standard deviation was computed for the center point using all the points within the spatial query distance. Although there are other morphometrics [50,52] that may be valuable for vegetation segmentation, this work limited the morphometric input to 3D StDev, because it is relatively simple, as a first test of how influential additional point geometry may be. For testing and demonstration purposes, the 3D search radius was limited to 1.0 m, meaning that all points within a 1.0 m search radius were used to calculate the 3D StDev. If no points were within this search radius, then the point was not used in model training, which also meant that any point cloud being reclassified with the model had to have at least one point within the same search radius to be reclassified. The 1.0 m search radius was used here simply as a test of whether including point cloud geometry metrics could improve the segmentation process. Because 3D StDev or other point cloud geometry metrics were computed for every point, applying a larger search radius significantly increased the time and computing resources required to derive the 3D StDev. Another option to decrease the computation time for geometry metrics is to first decimate the input point clouds and then derive the metrics, although eliminating points may also suppress areas of points that would otherwise be leading indicators of larger erosion events. As such, it is important that ML methods be able to segment whole point clouds without sub-sampling.
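The neighborhood query described above can be sketched with scipy's cKDTree as follows. How the per-axis deviations are aggregated into a single 3D StDev value is not specified in the text, so the Euclidean norm used here is an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def point_std_3d(xyz, radius=1.0):
    """Per-point 3D standard deviation of neighboring XYZ coordinates.

    Sketch of the 1.0 m neighborhood query described in the text; collapsing
    the per-axis deviations into one value (here, their Euclidean norm) is an
    assumption, since the exact formula is not given.
    """
    tree = cKDTree(xyz)
    neighbors = tree.query_ball_point(xyz, r=radius)
    stdev = np.full(len(xyz), np.nan)
    for i, idx in enumerate(neighbors):
        if len(idx) > 1:             # skip isolated points (excluded from training)
            stdev[i] = np.linalg.norm(xyz[idx].std(axis=0))
    return stdev

# Flat 0.4 m x 0.4 m patch of points: all neighborhoods overlap, Z varies not at all,
# so the resulting 3D StDev values are small and driven by X/Y spread only.
xyz = np.column_stack([np.tile(np.arange(5), 5) * 0.1,
                       np.repeat(np.arange(5), 5) * 0.1,
                       np.zeros(25)])
sd = point_std_3d(xyz, radius=1.0)
```

Because `query_ball_point` is run for every point, the cost grows quickly with search radius and cloud density, consistent with the computing-resource concern noted above.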

Multi-Layer Perceptron (MLP) Architecture and Inputs
This paper tested the utility of several MLP ML model architectures with different inputs, numbers of layers, and nodes per layer for efficient and accurate point cloud reclassification (Table A1). These MLP models were compared against the CANUPO classifier in CloudCompare because it is a well-established approach in a widely used software package. MLP models have been applied in previous work segmenting debris-covered glaciers [53], extracting coastlines [54], detecting and segmenting landslides [55], and segmenting vegetation in imagery [56], although the current work tested MLP as a simpler standalone segmentation approach for dense point clouds.
MLP models used here were built, trained, validated, and applied using Tensorflow [57] with Keras [58] in Python. They all followed a similar workflow and architecture (Figure 4), where data are first normalized to range from 0 to 1 and then processed through one or more dense, fully connected layers. The simplest models had only one densely connected layer, while the most complex models had six densely connected layers with an increasing number of nodes per layer. Varying the model architecture facilitated determining the most parsimonious model suitable for accurate vegetation segmentation. Every model included a dropout layer with a 20% chance of node dropout before the output layer to avoid model overfitting. Models used an Adam optimizer [59,60] to schedule the learning rate and increase the likelihood of reaching a global minimum, and a rectified linear unit (ReLU) activation function was used to overcome vanishing gradients and improve training and model performance.
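A minimal Keras sketch of this architecture is shown below. The layer sizes, the binary sigmoid output, and the binary cross-entropy loss are illustrative assumptions, not the exact configurations tested in the paper.

```python
import tensorflow as tf

def build_mlp(n_inputs, layer_sizes=(16, 16, 16)):
    """Sketch of the MLP described above: normalized inputs, dense ReLU layers,
    20% dropout before the output layer, and an Adam optimizer.

    `layer_sizes` is illustrative; the paper varied one to six dense layers.
    """
    model = tf.keras.Sequential([tf.keras.Input(shape=(n_inputs,))])
    for n in layer_sizes:
        model.add(tf.keras.layers.Dense(n, activation="relu"))
    model.add(tf.keras.layers.Dropout(0.2))                    # guard against overfitting
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))  # vegetation vs. bare-Earth
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_mlp(n_inputs=3)  # e.g., an RGB-only model
```

For an RGB-only model like this, the parameter count stays in the hundreds, which is why such architectures are fast to train and apply relative to larger classifiers.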
The models were trained using two input point clouds, each containing points belonging to a single class. One training point cloud contained only points visually identified as vegetation, and the other contained only points identified as bare-Earth. Both training point clouds were an aggregate of either vegetation or bare-Earth points from multiple dates of the same area spanning different growing seasons and years. For example, vegetation points were first manually clipped from no fewer than 10 different point clouds, and then these clipped point cloud segments were merged into one combined vegetation point cloud as an input for the machine learning models. This process was repeated for the bare-Earth points. Special care was taken to ensure that the vegetation and bare-Earth training dense clouds included a range of lighting conditions, and, in the case of vegetation points, the training point cloud included different growing seasons and years, accounting for natural variability in vegetation growth caused by wet and dry seasons/years.
Class imbalance issues can significantly bias ML models toward over-represented classes and cause them to minimize or completely fail to learn under-represented class characteristics [61][62][63][64][65][66]. Vegetation and bare-Earth point clouds were split 70-30 for model development and training, with 70% of the data used to train the model (i.e., "training data") and the remaining 30% used to independently evaluate the model after training (i.e., "evaluation data"). Of the training data split, 70% was used directly for training and 30% was used to validate the model after each epoch (i.e., "validation data"). All sampling of the training, validation, and evaluation sets was random. Since class imbalance was addressed prior to splitting the data, model overfitting to over-represented classes was mitigated and evaluation metrics were more reliable.
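The nested 70-30 splits can be sketched as below; the helper name and random seed are illustrative, and the splitting here is by shuffled index rather than whatever bookkeeping the authors used.

```python
import numpy as np

def split_points(n_points, seed=0):
    """Random 70-30 train/evaluation split, then a 70-30 train/validation
    split of the training portion, mirroring the splits described above."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_points)
    n_eval = int(round(0.3 * n_points))
    evaluation, training = order[:n_eval], order[n_eval:]
    n_val = int(round(0.3 * len(training)))
    validation, train = training[:n_val], training[n_val:]
    return train, validation, evaluation

train, val, evaluation = split_points(1000)
```

Net result: 49% of the points train the model directly, 21% validate it each epoch, and 30% are held out entirely for the final EvAcc evaluation.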
The paper explored whether MLP models were more accurate when model inputs included one or more vegetation indices or geometric derivatives compared to when the only model inputs were RGB values. Models were broadly categorized into one of five categories based on their inputs:

• RGB: These models only included RGB values as model inputs;
• RGB_SIMPLE: These models included the RGB values as well as the ExR, ExG, ExB, and ExGR vegetation indices; these four indices were included because each one is relatively simple, abundant in the previously published literature, and efficient to calculate;
• ALL: These models included RGB and all stable vegetation indices listed in Table 1;
• SDRGB: These models included RGB and the 3D StDev computed using the X, Y, and Z coordinates of every point within a given radius;
• XYZRGB: These models included RGB and the XYZ coordinate values for every point.

ML Model Evaluation
Model performance was recorded during the training process, and the final model was re-evaluated using the evaluation set of points withheld during the training process. The number of tunable parameters was compared across different ML model architectures and inputs to determine which models and inputs were most efficient to train and apply. Accuracy was displayed and recorded after every epoch during model training, and the final model was independently evaluated using the evaluation points, which were not used during model training or validation. The last reported accuracy during training was recorded as the model training accuracy (TrAcc), while the model evaluation accuracy (EvAcc) was computed using the Tensorflow model.evaluate() function with the evaluation data split. Unfortunately, independent ground truth observations were not made in the field, so it is not possible to evaluate the performance more rigorously against a field-collected set of points. However, given the very large number of evaluation points (~13,470,000), split evenly between bare-Earth and vegetation points, the EvAcc serves as a reasonable measure of model accuracy.

Case Study: Elwha Bluffs, Washington, USA
The ability of the MLP models to classify bare-Earth and vegetation points was tested on a section of bluffs near Port Angeles, WA, located along the Strait of Juan de Fuca just east of the Elwha River mouth (Figure 5). Although erosion has slowed with the growth of a fronting beach, large sections of the bluff still erode periodically in response to changing environmental conditions and storms. Previous research demonstrated the efficacy and accuracy of deriving high-accuracy point clouds from SfM with oblique photos [19]. Although the complete photogrammetry pipeline includes SfM for photo alignment and multiview stereo (MVS) for constructing the dense cloud, "SfM" will be used here to refer to the complete SfM-MVS data pipeline. ML model development and testing were done using an SfM-derived point cloud for 8 May 2020 that was aligned and filtered using the 4D approach with individual camera calibrations and differential GPS (Figure 5) because this approach [68] produced the most accurate point cloud compared to other approaches tested by [19].
SfM-derived point clouds are advantageous over LIDAR or TLS point clouds because the SfM photogrammetry process natively includes RGB values for every point. LIDAR and TLS record only the geometry and intensity of each return; the RGB colors often contained in LIDAR or TLS point clouds are added during post-processing.
The Elwha Bluffs dataset highlights the importance of being able to efficiently segment vegetation from bare-Earth points since photos of these bluffs periodically show fresh slump faces and talus deposition at the base of the bluff (visible in Figure 5). Detecting micro-deformation or erosion of the bluff face is important for predicting bluff stability, and monitoring these fine-scale changes requires being able to identify and distinguish true erosion from false positives, which may be caused by vegetation growth or senescence through the seasons. Vegetation growth over the summer or across years, especially for vegetation near the bluff base, can appear as "positive" change from the bluff face, which may suggest the bluff has accreted, irrespective of the true change (erosion or deposition) in the area. Conversely, if vegetation points are included in a time-change analysis, vegetation senescence can introduce false positives for erosion. As the leaves and grasses die back in the fall and winter, subtracting a point cloud from fall or winter from one during the summer may falsely indicate erosion is occurring.
While identifying bare-Earth-only points can be done manually on one or a few datasets, manually segmenting these bare-Earth points is not scalable to point clouds covering large areas or to areas with a large time series of point clouds. The manual segmentation approach is also more subjective, potentially varying by user and season, and increasingly time-consuming as the number of datasets continues to increase with ongoing photo surveys.

Results
Point cloud classification with the CANUPO classifier in CloudCompare had a reported accuracy of 88.7%. However, visual inspection of the reclassified point cloud (Figure 2) shows salt-and-pepper "noise" of points incorrectly classified as vegetation throughout the bluff face. In contrast, point clouds reclassified with MLP models exhibited less salt-and-pepper noise than was present in the CANUPO reclassified points. MLP-model reclassified points appeared visually consistent with areas of vegetation and bare-Earth points in the original point cloud (Figure 6), with little variability between the RGB, RGB_SIMPLE, and ALL models.

MLP models had variable EvAcc (Table A2; Figure 7), and the average EvAcc was 91.2%. XYZRGB models were substantially less accurate than all other models, with an accuracy of only 50.0%, while the minimum EvAcc of any other model was 89.5% (RGB_8_8). The most accurate model was SDRGB_16_16_16 (EvAcc: 95.3%), although computing the standard deviation for every point took significantly longer than the model training process for either model using standard deviation. When MLP model architecture was held constant for both the number of layers and nodes per layer, the SDRGB model was only 1.3% more accurate than the RGB model.
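EvAcc as used here is classification accuracy on the held-out evaluation split (TrAcc being its training-split counterpart). Assuming that reading, the metric reduces to the fraction of points whose predicted class matches the manual label:

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of points whose predicted class matches the label."""
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

# Four evaluation points, one misclassified: accuracy = 3/4.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```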
For models where inputs were the same (e.g., all RGB, RGB_SIMPLE, or ALL models), simpler models generally performed well with significantly fewer tunable parameters than their more complex counterparts with several layers and more nodes per layer (Table A2). EvAcc improved slightly as the number of layers and nodes per layer increased but began to asymptote when more than 2-3 layers were included. For example, the RGB_16 model had an EvAcc of 92.1%, and the RGB_16_32 model had an EvAcc of 93.8%, only a 1.7% improvement in EvAcc. However, the most complex RGB model tested included six layers with nodes doubling every successive layer from 16 to 512 nodes (RGB_16_32_64_128_256_512) and had 176,161 tunable parameters and an EvAcc of 93.9%. The most complex RGB model was only 0.1% more accurate than the simpler RGB_16_32. In addition, the number of tunable parameters substantially increased from 81 (RGB_16) and 641 (RGB_16_32) parameters to 176,161 parameters with RGB_16_32_64_128_256_512. This relationship between EvAcc and tunable parameters was also true among (a) RGB_SIMPLE and (b) ALL models.
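The reported parameter counts follow standard dense-layer arithmetic: each fully connected layer contributes n_in × n_out weights plus n_out biases. A small sketch reproduces the totals above, assuming 3 RGB inputs and a one-node output stage (an assumption on my part; the counts quoted in the text are consistent with it):

```python
def mlp_param_count(n_inputs, hidden_layers, n_outputs=1):
    """Tunable parameters of a fully connected MLP:
    each dense layer adds n_in * n_out weights plus n_out biases."""
    total = 0
    n_in = n_inputs
    for n_out in list(hidden_layers) + [n_outputs]:
        total += n_in * n_out + n_out
        n_in = n_out
    return total

print(mlp_param_count(3, [16]))           # RGB_16 -> 81
print(mlp_param_count(3, [16, 32]))       # RGB_16_32 -> 641
print(mlp_param_count(3, [16, 32, 64, 128, 256, 512]))  # -> 176161
```

The counting also makes the asymmetry plain: doubling nodes layer after layer grows the parameter count roughly quadratically, while EvAcc gains stay under a fraction of a percent.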
All segmented point clouds had some minor salt-and-pepper noise caused by the misclassification of some points. This is expected because no model is 100% accurate, and this is true regardless of the number of layers in the model (Figure 8). It is important to note that the point clouds did not show any systematic misclassification either way. Although Figure 8 only shows the results for RGB models, this relationship was true for all model types.
When model architecture was held constant and only the model inputs varied, RGB models performed well compared to RGB_SIMPLE and ALL models (Table A2). The RGB_16 model only utilized the RGB values as inputs and had an EvAcc of 92.14% with 81 tunable parameters. This compared favorably to the ALL_16 model, which included all vegetation indices as inputs and had an EvAcc of 93.53% with 241 tunable parameters. The number of tunable parameters increased with the number of inputs for all models. ALL_16 model EvAcc was only marginally better (~1.39%) than RGB_16 EvAcc.

Other models explored here held the number of nodes constant across all layers while the number of layers changed. For example, RGB_16_16 had two layers with 16 nodes per layer, and RGB_16_16_16 had three layers with 16 nodes per layer. Models with the same number of nodes but varying layer numbers had comparable EvAcc regardless of whether the model had 2 or 3 layers or 8 or 16 nodes per layer. Aside from the RGB_8_8 model, where EvAcc was only 89.5%, the EvAcc for these relatively simple models was similar for RGB, RGB_SIMPLE, and ALL models. Three-layer models with the same number of nodes per layer generally had a higher EvAcc than the equivalent two-layer models, although the difference amounted to an improvement of less than 1% in most cases. No salt-and-pepper noise was visible in the bluff face for any of the 2- or 3-layer models with 8 or 16 nodes per layer (Figure 9).

Models with only RGB inputs had the fewest tunable parameters across comparable model architectures (Table A2; Figure 10), while models using RGB and all vegetation indices had the most tunable parameters. When the model inputs were held constant, increasing the number of layers and nodes per layer also increased the number of tunable parameters (Table A2; Figure 10). The model with the most parameters included RGB and all vegetation indices as inputs and had 6 layers starting with 16 nodes, doubling the number of nodes with each successive layer. Model EvAcc generally increased logarithmically with the number of tunable parameters until reaching a quasi-plateau at ~94% accuracy (Figures 7 and 10).

Discussion
Results demonstrate that MLP models can efficiently and accurately segment vegetation and bare-Earth points in high-relief data, such as coastal cliffs and bluffs, with or without computing any vegetation indices. EvAcc sharply increased from very simple models and plateaued around ~94%, even as the number of tunable parameters continued to increase. Ref. [29] suggested that vegetation indices may be valuable in segmenting vegetation, although the comparison of MLP models with RGB against those with RGB_SIMPLE or ALL here suggests that vegetation indices may not substantially improve our ability to separate vegetation from bare-Earth points in coastal cliff point clouds. While vegetation indices may provide a marginal increase in accuracy for other ML approaches [29], the increase in the number of tunable parameters for MLP models with increasing complexity was disproportionate to the relatively minor improvement in EvAcc. For example, ALL_16 had 2.9 times more tunable parameters than RGB_16 but was only 1.5% more accurate. In all models tested, the number of tunable parameters substantially outpaced the gain in EvAcc when vegetation indices were included as model inputs. This suggests that, although the RGB models may be ≤1% less accurate than models using vegetation indices, this reduction in accuracy was not significant and was outweighed by the more efficient training and deployment of the RGB models.
An advantage of using only RGB color values as model inputs is the reduction in data redundancy. Because vegetation indices were derived from RGB values, the information encoded in them is somewhat redundant when the derived indices are included with RGB. Furthermore, not having to compute one or more vegetation indices reduces the time and computing resources required for data pre-processing. Computing vegetation indices for every point in the training point clouds, and for every point in the point cloud being reclassified, can be computationally demanding, and these demands grow rapidly with point cloud size since each additional vegetation index requires storing another set of index values equal in size to the original input data. Employing a simpler model with only RGB inputs eliminates the need for extensive pre-processing and reduces RAM and other hardware requirements, while segmenting vegetation points from bare-Earth points with EvAcc comparable to a more complex model that includes one or more pre-computed vegetation indices.
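Because each index is an arithmetic combination of the same RGB values the model already receives, the redundancy is easy to see. A sketch of two widely used RGB indices, using the standard formulations from the vegetation-index literature (the paper's exact formulations and thresholds may differ):

```python
import numpy as np

def excess_green(r, g, b):
    """ExG = 2g - r - b on chromatic (sum-normalized) coordinates."""
    total = r + g + b
    total = np.where(total == 0, 1, total)  # guard divide-by-zero
    rn, gn, bn = r / total, g / total, b / total
    return 2 * gn - rn - bn

def ngrdi(r, g, b):
    """Normalized green-red difference index: (G - R) / (G + R)."""
    denom = np.where((g + r) == 0, 1, g + r)
    return (g - r) / denom

# A leafy green point vs. a grey-brown sediment point (0-255 RGB).
veg = excess_green(np.array([60.0]), np.array([160.0]), np.array([50.0]))
sed = excess_green(np.array([140.0]), np.array([130.0]), np.array([120.0]))
print(veg, sed)  # vegetation scores well above bare sediment
```

Each such index is one more full-size float array held alongside the cloud, which is where the RAM cost quoted above comes from: an extra index on a 100-million-point cloud is another ~800 MB of double-precision values.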
The most accurate models tested used RGB values and a pointwise StDev, in line with previous work suggesting that local geometric derivatives may be useful in segmenting point clouds [50,52,69]. However, the SDRGB models were challenging to implement due to the high computational demands of computing StDev for every point in the point cloud(s). Similar to [50], we used cKDTrees to optimize the spatial query; however, performing this query for each point and then computing the StDev of the nearby points was not efficient when repeated across very large, dense point clouds such as those used in this work. In this form, the current optimized solution for deriving spatial metrics was not scalable to very dense and large point cloud datasets. Further research may optimize this spatial query and geometry computation process, although this is beyond the scope of the current paper. SDRGB models were not considered reasonable solutions to vegetation classification and segmentation because the demanding pre-processing yielded a model only ~2% more accurate than the RGB models.
Figures 7 and 10, in conjunction with no discernible improvement of complex models over simpler ones, indicate diminishing returns with model complexity. Seeking the simplest model (i.e., fewest tunable parameters) that provides acceptable accuracy, two general classes of models stand out: (1) models with 3 layers and 8 nodes per layer (RGB_8_8_8, RGB_SIMPLE_8_8_8, and ALL_8_8_8), and (2) models with 2 layers and 16 nodes per layer (RGB_16_16, RGB_SIMPLE_16_16, and ALL_16_16). Both classes of models had accuracies around 94% with fewer tunable parameters than their more complex counterparts, regardless of the different inputs. For example, RGB_16_32 had 641 tunable parameters and an EvAcc of 93.8%, whereas the simpler RGB_16_16 had ~45% fewer tunable parameters (353) and a slightly greater EvAcc of 93.86%. These results suggest that the simpler 2- to 3-layer models with 8-16 nodes per layer can perform as well as or better than larger models with more tunable parameters and are, therefore, preferred over larger models.
Results demonstrate that ML models can efficiently and accurately segment vegetation and bare-Earth points, in addition to offering the potential one-shot transfer of information to new areas and datasets. With manual point cloud segmentation, one-shot knowledge transfer is not possible, limiting rapid assessment of coastal erosion. Leveraging patterns and information from one location and application, and transferring this knowledge to new datasets and geographies, enables agencies, communities, and researchers to better map and monitor ever-changing coastal cliffs and bluffs.
MLP models are also less subjective than manual segmentation or other, simpler approaches. Manual segmentation or the thresholding of specific vegetation indices can be biased toward particular vegetation types and environments. In contrast, MLP models using only RGB were not affected by the subjective decision to add one or more vegetation indices. Using an RGB-only ML model eliminated the need to make this arbitrary choice between one index or another; instead, the model uses the RGB values directly and requires no extra computations.
The models trained and presented here used colorized point clouds with red, green, and blue channels only. Not having access to additional reflectance bands, such as near-infrared, limits segmenting vegetation from bare-Earth points with even greater accuracy, as well as segmenting different types of vegetation. The utility of NDVI and similar multispectral vegetation indices in vegetation segmentation and identification has been demonstrated [46,70], although [71] suggested that RGB models may be tuned to perform close to models with hyperspectral channels. In addition, near-infrared and other hyperspectral channels are often not available for point cloud data; thus, developing ML methods that use simple RGB data increases the amount of data available for coastal monitoring.

Conclusions
Monitoring high-relief coastal cliffs and bluffs for deformation or small-scale erosion is important for understanding and predicting future cliff erosion events, although our ability to detect such changes hinges on distinguishing false positive changes from real change. Segmenting coastal cliff and bluff point clouds into vegetation and bare-Earth points can help provide insight into these changes by reducing "noise" in the change results. Automated point cloud segmentation using MLP ML models is feasible and has several advantages over existing approaches and vegetation indices. Leveraging MLP models for point cloud segmentation is more efficient and scalable than a manual approach while also being less subjective. Results demonstrate that models using only RGB color values as inputs performed equal to more complex models with one or more vegetation indices, and simpler models with 2-3 layers and a constant number of nodes per layer (8 or 16 nodes) outperformed more complex models with 1-6 layers and an increasing number of nodes per layer. Simpler models with only RGB inputs can effectively segment vegetation and bare-Earth points in large coastal cliff and bluff point clouds. Segmenting and filtering vegetation from coastal cliff point clouds can help identify areas of true geomorphic change and improve coastal monitoring efficiency by eliminating the need for manual data editing.

Conflicts of Interest:
The author declares no conflicts of interest.

Figure 1. Oblique view of an SfM point cloud shows vegetation overhanging the bluff face and obscuring the bare-Earth bluff face, as well as several fallen tree trunks.


Figure 3. Vegetation indices can be computed from RGB bands to segment vegetation from non-vegetation pixels in an individual image. Some examples of different vegetation indices computed from a raw RGB image (a) are (b) ExB, (c) ExG, (d) NGRDI, (e) RGBVI, and (f) GLI.

Figure 4. MLP model architectures tested in this paper followed a similar structure. All input data were linearly normalized to range from 0 to 1. Some models included one or more additional vegetation indices (as indicated by the dashed line on the right), derived from the normalized RGB values. Other models included a computed 3D standard deviation value for every point as a model input (see the dashed line on the left). All inputs were concatenated in an input layer and then fed into one or more dense, fully connected layers, where the number of layers and nodes per layer varied (as represented by the *) to test model architectures. A dropout layer with probability 0.2 was added after these dense layer(s) and before the final dense layer, which had 2 nodes, corresponding to the number of classes included in the model. The final layer was an activation layer that condensed the probabilities from the 2-node dense layer into a single class label.

Class imbalance issues can significantly bias ML models toward over-represented classes and minimize or completely fail to learn under-represented class characteristics [61-66], a problem not unique to vegetation segmentation. Although the vegetation and bare-Earth training point clouds were different sizes (22,451,839 vegetation points and 102,539,815 bare-Earth points), class imbalance was addressed here by randomly down-sampling the bare-Earth points to equal the number of vegetation points prior to model training. A similar approach to balancing training data for point cloud segmentation was employed by [67]. The balanced training classes were then used to train and evaluate the MLP models.

Training data for the ML models were created by first manually clipping vegetation-only points from several point clouds in CloudCompare. Because this paper is focused on developing a model(s) robust to season, lighting conditions, and year, both sets of training point clouds (vegetation and bare-Earth) were composed of samples from different growing seasons and years across the 30+ photosets from [68]. All the clipped points representing vegetation were merged into a single vegetation training point cloud, and all clipped point clouds representing bare-Earth points were merged into a combined bare-Earth training point cloud. Vegetation and bare-Earth point clouds were split 70-30 for model development and training, with 70% of the data used to train the model (i.e., "training data") and the remaining 30% used to independently evaluate the model after training (i.e., "evaluation data"). Of the training data split, 70% was used directly for training and 30% was used to validate the model after each epoch (i.e., "validation data"). All sampling of training, validation, and evaluation sets was random. Since class imbalance was addressed prior to splitting the data, model overfitting to over-represented classes was mitigated and evaluation metrics were more reliable. The paper explored whether MLP models were more accurate when model inputs included one or more vegetation indices or geometric derivatives compared to when the only model inputs were RGB values. Models were broadly categorized into one of five categories based on their inputs:

Figure 5. The Elwha Bluffs are located east of the Elwha River Delta along the Strait of Juan de Fuca, Washington (modified from [15]). An SfM-derived point cloud from 8 May 2020 shows vegetation obscuring the bare-Earth bluff face.


Figure 6. Coastal bluff point cloud for 8 May 2020 reclassified using MLP models with one layer of 16 nodes but different inputs.


Figure 7. Accuracy varied by model complexity and size, as measured by the number of tunable parameters.


Figure 8. Point cloud from 8 May 2020 reclassified with different MLP models where the number of nodes per layer and number of layers increased from 1 layer with 16 nodes to 6 layers with 16 to 512 nodes.


Figure 9. Point cloud from 8 May 2020 reclassified with different MLP models with relatively simple architectures of 2 layers with 8 nodes each or 3 layers of 16 nodes each.


Figure 10. Accuracy varied by the number of tunable parameters.


Table A1. MLP model architectures tested for vegetation segmentation of point cloud data.

Table A2. MLP model architecture, training information (training epochs and time), and accuracies (TrAcc and EvAcc) for all vegetation classification and segmentation models tested.