Vegetation Filtering of a Steep Rugged Terrain: The Performance of Standard Algorithms and a Newly Proposed Workﬂow on an Example of a Railway Ledge

: Point clouds derived using structure from motion (SfM) algorithms from unmanned aerial vehicles (UAVs) are increasingly used in civil engineering practice. This includes areas such as (vegetated) rock outcrops or faces above linear constructions (e.g., railways) where accurate terrain identiﬁcation, i.e., ground ﬁltering, is highly difﬁcult but, at the same time, important for safety management. In this paper, we evaluated the performance of standard geometrical ground ﬁltering algorithms (a progressive morphological ﬁlter (PMF), a simple morphological ﬁlter (SMRF) or a cloth simulation ﬁlter (CSF)) and a structural ﬁlter, CANUPO (CAract é risation de NUages de POints), for ground identiﬁcation in a point cloud derived by SfM from UAV imagery in such an area (a railway ledge and the adjacent rock face). The performance was evaluated both in the original position and after levelling the point cloud (its transformation into the horizontal plane). The poor results of geometrical ﬁlters (total errors of approximately 6–60% with PMF performing the worst) and a mediocre result of CANUPO (approximately 4%) led us to combine these complementary approaches, yielding total errors of 1.2% (CANUPO+SMRF) and 0.9% (CANUPO+CSF). This new technique could represent an excellent solution for ground ﬁltering of high-density point clouds of such steep vegetated areas that can be well-used, for example, in civil engineering practice.


Introduction
In engineering practice, unmanned aerial vehicles (UAVs) equipped with cameras are now increasingly used for the acquisition of 3D data. However, it is often difficult to acquire detailed data of specific terrain profiles such as steep rugged rock terrain covered with vegetation. Such terrain is common in the vicinity of linear structures (railways, roads, etc.) as well as in natural areas including protected areas in the mountains and other nature reserves. Although such steep rugged terrain can be pleasing to the eye, it can also pose a major risk to safety. This is particularly true for the aforementioned linear constructions where objects falling from such slopes can cause potentially lethal accidents not to mention the financial damage (e.g., a train derailed due to a boulder on the track). For this reason, safety measures (e.g., netting preventing objects from falling) are widely employed.
These measures are, however, costly and, for this reason, a thorough consideration of where, what type and to what extent to apply them is needed. This, in turn, requires quality data for decision making. Field surveys can be difficult in such terrain and may fail to capture the entire area due to poor accessibility. For this reason, remote sensing methods are increasingly used. In addition to the aforementioned UAV, typically combined with photogrammetry (especially structure from motion (SfM) or 3D scanning (terrestrial, airborne/mobile)), light detection and ranging (LiDAR) can be used. Once data are acquired, however, it is necessary to acquire terrain data by removing points representing undesirable phenomena from the dataset such as vegetation or other objects. Using suitable techniques, vegetation removal can be to a certain extent performed even before the measurement itself (for example, using the leaf-off period for data acquisition); nevertheless, much of the unwanted data elimination must be performed after the data acquisition, i.e., through data filtering using various (classification) algorithms. Filtering algorithms classify the clouds into data describing the surface of the object of interest (typically terrain) and other data (e.g., vegetation). Of course, the existence of the points lying directly on the surface of interest in the point cloud is a necessary precondition for successful filtering.
The point cloud classification for vegetation removal (and/or ground filtering, which removes vegetation also other objects such as buildings) has been extensively studied (see, for example, [1][2][3][4][5][6][7][8]), focusing on filtering algorithms based on surface geometry. Many geometry-based algorithms, such as the progressive morphological filter (PMF, [9]), the simple morphological filter (SMRF, [10]) or the cloth simulation filter (CSF, [11]), have been implemented in commercial programs (Agisoft Metashape, 3D Reshaper, etc.). For the algorithm to yield optimal results, it is, however, usually necessary to choose a correct combination of parameter settings for the given terrain type and vegetation coverage, which can be often problematic especially in difficult terrains. Filtering algorithms are highly suitable for LiDAR data processing as LiDAR, thanks to its capability of registering multiple returns after a partial penetration through vegetation, can acquire data even below the superficial vegetation cover [12][13][14]. LiDAR data are typically acquired by high-altitude scanning, which is known to be associated with a problematic identification of almost vertical terrain and may be unsuitable for a detailed evaluation of rugged terrain due to their relatively low density of just a few points per square meter. LiDAR scanners that can be mounted on the UAVs and can provide a detailed recording of the surface are at present prohibitively expensive. For this reason, the SfM processing of UAV imagery is a promising alternative.
With the boom of unmanned aerial vehicles in the last decade, many applications of such systems have been reported; for example, in the fields of archaeology [15], ecology [16], natural disaster monitoring [17][18][19], solar radiation [20], bark beetle infestation [21], the differentiation of various vegetation types [22] or reclamation [23], usually evaluating UAV data using aerial photogrammetry (the SfM method). Compared to LiDAR data, photogrammetric data are much more sensitive to the quality of imagery, which is defined by the correct sensing methodology (distance, resolution, sensing angle, etc.) and by the setup of the computational process [24][25][26][27]. Several studies have also combined laser scanning and photogrammetry using synergies between the two ( [28,29]).
However, even if dealing with highly detailed data, the correct adjustment of the parameters of the filtering algorithms, especially in rugged terrain, may be very important for achieving high detail and the best possible description of the surface. Several studies [30][31][32], therefore, have focused on comparing and testing filtering algorithms on specific data to determine the best possible algorithm settings; such studies often use open-source software implementing one or more of such algorithms for this purpose ( [33,34]).
Most ground filtering algorithms have been created for the analysis of standard highaltitude airborne scanning data in which the ground surface appears principally flat (and almost level). As shown in multiple studies [35][36][37][38], many filtering algorithms can be applied on low-slope terrain with relatively good results. The automatic filtering of SfM data acquired from extremely steep and rugged terrain (rock faces, overhangs, etc.) is, however, difficult and must often be performed manually, which is extremely laborious. In view of the need for such data for safety purposes, we aimed in this study to (i) optimize the standard geometric filters (PMF, SMRF, CSF) for filtering UAV-derived SfM data acquired in a steep rugged terrain and evaluate their performance in the original orientation and (ii) after levelling the data into the horizontal position, expected to be more suitable for the operation of these filters. In addition, we aimed to (iii) assess the performance of a principally different CANUPO filter, (iv) investigate the potential synergisms between the CANUPO and geometrical filters by using a sequential combination of these filters and assess the performance of such a combination and (v), if the results permit, to propose a new workflow for the vegetation filtering of such data.

Data for Testing
Data for testing covered an area of 157 × 49 m with an elevation difference of 47 m (see Figure 1 for the aerial view of the study area). Due to the steep character of the terrain, the imagery was acquired with a variable angle of acquisition (perpendicular to the terrain); the UAV was manually piloted and the image acquisition was performed more or less from a constant distance of 15-30 m perpendicular to the terrain. In all, 248 images with a resolution of 4056 × 3040 px were taken. The axis of the images was approximately perpendicular to the terrain (i.e., most images were oblique relative to the vegetation). For georeferencing, 11 ground control points (GCPs) were used, 3 of which were in the upper part of the slope/rocks and the remainder on more accessible spots along the base of the slope. Individual GCPs were georeferenced using a total station connected to the coordinate system by the measurement of three distant points determined by the GNSS RTK receiver. This method was used because, in view of the terrain characteristics, a large part of the sky was obstructed by the slope, thus reducing the accuracy of the GNSS RTK measurement directly on site due to the unavailability of a sufficient number of satellites. In addition, 30 control points lying on the rocks (without vegetation obstruction) were determined using a total station with a non-prism distance meter and their position was compared to the original point cloud with the differences not exceeding 3 cm.
Remote Sens. 2021, 13, x FOR PEER REVIEW 3 of 12 the standard geometric filters (PMF, SMRF, CSF) for filtering UAV-derived SfM data acquired in a steep rugged terrain and evaluate their performance in the original orientation and (ii) after levelling the data into the horizontal position, expected to be more suitable for the operation of these filters. In addition, we aimed to (iii) assess the performance of a principally different CANUPO filter, (iv) investigate the potential synergisms between the CANUPO and geometrical filters by using a sequential combination of these filters and assess the performance of such a combination and (v), if the results permit, to propose a new workflow for the vegetation filtering of such data.

Data for Testing
Data for testing covered an area of 157 × 49 m with an elevation difference of 47 m (see Figure 1 for the aerial view of the study area). Due to the steep character of the terrain, the imagery was acquired with a variable angle of acquisition (perpendicular to the terrain); the UAV was manually piloted and the image acquisition was performed more or less from a constant distance of 15-30 m perpendicular to the terrain. In all, 248 images with a resolution of 4056 × 3040 px were taken. The axis of the images was approximately perpendicular to the terrain (i.e., most images were oblique relative to the vegetation). For georeferencing, 11 ground control points (GCPs) were used, 3 of which were in the upper part of the slope/rocks and the remainder on more accessible spots along the base of the slope. Individual GCPs were georeferenced using a total station connected to the coordinate system by the measurement of three distant points determined by the GNSS RTK receiver. This method was used because, in view of the terrain characteristics, a large part of the sky was obstructed by the slope, thus reducing the accuracy of the GNSS RTK measurement directly on site due to the unavailability of a sufficient number of satellites. In addition, 30 control points lying on the rocks (without vegetation obstruction) were determined using a total station with a non-prism distance meter and their position was compared to the original point cloud with the differences not exceeding 3 cm.  It is necessary to note that this georeferencing was only performed in view of the use of this point cloud for purposes outside of this study as it had no effect on the evaluation of the algorithms for vegetation filtering (the absolute accuracy of the points within the coordinate system was not assessed). The data were processed in Agisoft Metashape v1.6, cropped to the area of interest and subsampled to achieve a final resolution of 5 cm ( Figure 2) with a final number of 2,520,442 points.
dinate system was not assessed). The data were processed in Agisoft Metashape v1.6, cropped to the area of interest and subsampled to achieve a final resolution of 5 cm ( Figure  2) with a final number of 2,520,442 points.
As the manual collection of GNSS RTK data in the presented type of terrain is virtually impossible (poor accessibility, poor GNSS RTK signal, too many points needed for a detailed description of the rugged terrain), verification data were obtained by a meticulous manual cleanup ( Figure 2). We intentionally opted for data of a relatively poor quality with a higher number of holes and interferences as these are typical of the data used for similar purposes in everyday practice. To provide a better idea of the data character, see the details in Figure 2.

Used Ground Filtering Algorithms
Filters implemented in freely available software solutions were used in this study, namely PMF and SMRF, both of which are included in the pdal software package (https://pdal.io/, accessed on 7 June 2021), and CSF and CANUPO, both included in CloudCompare software version 2.11.3 (www.cloudcompare.org, accessed on 7 June 2021). Below, the basic principles of the individual filters will be briefly described and references to papers containing in-depth descriptions are provided for individual filters.
The geometrical filters (PMF, SMRF, CSF), in general, work on the principle of creating a first iteration of the terrain as a lowest point in a regular or non-regular grid. This provisional terrain model is further refined by various geometric methods with the use of points of an original point cloud. Those points of the original point cloud that do not exceed As the manual collection of GNSS RTK data in the presented type of terrain is virtually impossible (poor accessibility, poor GNSS RTK signal, too many points needed for a detailed description of the rugged terrain), verification data were obtained by a meticulous manual cleanup ( Figure 2). We intentionally opted for data of a relatively poor quality with a higher number of holes and interferences as these are typical of the data used for similar purposes in everyday practice. To provide a better idea of the data character, see the details in Figure 2.

Used Ground Filtering Algorithms
Filters implemented in freely available software solutions were used in this study, namely PMF and SMRF, both of which are included in the pdal software package (https:// pdal.io/, accessed on 7 June 2021), and CSF and CANUPO, both included in CloudCompare software version 2.11.3 (www.cloudcompare.org, accessed on 7 June 2021). Below, the basic principles of the individual filters will be briefly described and references to papers containing in-depth descriptions are provided for individual filters.
The geometrical filters (PMF, SMRF, CSF), in general, work on the principle of creating a first iteration of the terrain as a lowest point in a regular or non-regular grid. This provisional terrain model is further refined by various geometric methods with the use of points of an original point cloud. Those points of the original point cloud that do not exceed a defined distance from the provisional terrain model are then classified as surface points. Detailed principles of individual geometrical filters are described in [9][10][11], respectively; the Remote Sens. 2021, 13, 3050 5 of 11 parameters that can be set for individual filters are shown in Table 1. As the algorithms are constructed for approximately a horizontal ground surface, the optimization of the settings was also performed on levelled data (i.e., data transformed by fitting a plane through the data and the subsequent rotation to make this plane horizontal; this transformation did not change the spatial relationships). Table 1. Tested settings of the filters.

Method
Parameter Values The CANUPO filter uses an altogether different approach, the principle of which was published in [39]. The evaluation is based on the examination of the spherical neighborhood of the near surroundings of the point to which a PCA is applied using the eigenvalues of a matrix constructed from the points from the neighborhood. In this way, this filter is able to well-identify classes with different superficial structures; in our case, these are represented by continuous surfaces and scattered points (such as individual leaves). This is done through a classifier in which the individual neighborhood sizes (dimensions) and the boundaries separating the classes are stored. A higher number of dimensions improves the quality of the classification but increases the computational time. This method works, therefore, quite differently from the other tested algorithms and does not depend on the position or rotation of the objects at all; this might be a great advantage in the situation studied in this paper.

Filter Application on Levelled Data
The steepness of the data profile represents an obvious problem for standard filters because they are (as mentioned) designed primarily for large-scale data where the terrain is essentially flat (horizontal) and vegetation is above it. The transformation of the data into the horizontal position, the performing of the classification and the returning of the already classified data to their original position by an inverse transformation appears to be a relatively simple and logical solution to this problem and should improve the results dramatically. It should be mentioned that for extremely rugged data that cannot be reasonably levelled, it could be necessary to split the data into areas of a similar slope and to "level" and calculate each part separately. The procedure can be applied to any method.

The Proposed New Workflow Combining CANUPO and a Geometric Filter
The CANUPO classification method distinguishes well between surfaces (smooth surfaces, basically 2D formations in space) and vegetation. In principle, the distribution of the vegetation points is to a large degree random, creating clouds of random noise. However, the treetops or shrubs can in a few cases look similar to surfaces and, thus, this classification fails here. This often occurs when evaluating photogrammetric images using the SfM method, the algorithm of which fundamentally creates an assembly of miniature flat areas (this feature can be partially suppressed by using a more detailed resolution). However, these areas are already quite far from the surface of interest (terrain) and can, therefore, be identified quite safely by one of the geometric filters. Therefore, the classification variants where CANUPO was applied to the levelled point cloud followed by a geometric filter and, finally, the cloud transformed back to its original position were proposed and tested in this study in addition to testing individual algorithms.

Filter Quality Evaluation
The testing of the effectiveness of individual (or combined) filters was always performed in the same way: 1.
Application of the particular algorithm on the testing data (data orig ) with chosen settings (resulting in the acquisition of the data filtered ).

2.
Determination of a Type I error by the evaluation of the distance from the filtered (data filtered ) to the reference manually cleaned (data ref ) point cloud in CloudCompare.

3.
Determination of a Type II error by evaluating the distance from the manually cleaned reference data (data ref ) to the data filtered in CloudCompare.
Steps 1-4 are repeated for all tested filters and settings variations and the resulting evaluation criteria are compared.
A Type I error was calculated by deducting the distances from the data filtered to the data ref . Where there were "holes" in the reference data, the distance was calculated to the nearest point of the other cloud. Points with the distance between the two clouds greater than a set value (in our case, this value was, in view of the resolution of 0.05 m, set to 0.1 m) that remained in the cloud although they should have been removed were considered a Type I error. In other words, if the filtered cloud did not have a correct nearest neighbor, the distance was greater than the threshold, which means that such a point should have been removed and was not.
A Type II error was determined by the equivalent algorithm with the only change being the inverse order of the clouds (the distance was calculated from the data ref to the data filtered ). The distance greater than the chosen value meant that the point was eliminated from the cloud during filtering although it should have remained in it. Such points were considered a Type II error. In other words, if the reference cloud did not have a correct nearest neighbor, the distance was greater than the threshold, which means that the respective point was missing in the filtered cloud and was incorrectly removed. For the overall evaluation, a simple summation of Type I and Type II errors was used.
For each filter, the evaluation was performed for all combinations of realistic parameter settings in the particular step and, subsequently, the combination yielding the best results from the perspective of the total evaluation (summation of Type I and Type II errors) was selected; the combinations are presented below. The tested parameter settings are presented in Table 1. For CANUPO filtering, a prm classifier file was created containing definitions of scales and values for classification; 10 scales with a regular step of 0.1 m (0.1; 0.2; 0.3; 0.4; 0.5; 0.6; 0.7; 0.8; 0.9; 1) were used. These tests were performed also for the levelled data. As in the case of the individual filters, the settings were also optimized for each combination of the filters.

Results
The testing results are shown in Tables 2 and 3; as the number of points was high, the effectiveness is described in per cent (the total number of points of the unfiltered point cloud data orig is considered 100%). Although the overall efficiency of the individual algorithms differs greatly when applied to data in the normal (original) position, it can be said that it is, generally, insufficient in all algorithms. The large number of erroneously discarded points (type II error) is a significant problem because important points on the edges, projections and apices (see Figure 3) are erroneously removed. Compared to geometric filters (PMF, SMRF, CSF), CANUPO shows a significantly higher efficiency but still suffers from a high number of points that have erroneously not been removed. This is because this method detects shapes (surfaces) and such surfaces can occur also, for example, in the canopy or tree trunks, thus mimicking a smooth surface of the ground. metrical filter is highly successful in removing the tree trunks and crowns that are at a higher distance from the ground.
The combined methods performed clearly the best with a total error of approximately 1%. The combination of CANUPO and CSF yielded slightly better results than CANUPO combined with SMRF; the combination with PMF yielded still inferior results. A visualization of the filtering success is presented in Figure 3.

Discussion
The commonly used methods for filtering out vegetation from point clouds are based on the basic geometric premise with the object of interest (usually the ground) being at After levelling the data, the errors of the geometrical filters dropped by about half. A significant improvement of the Type II error was also observed but still the results were not of a sufficient quality. Hence, the combination of CANUPO + a geometrical filter was applied to the data. The combination was beneficial as these filters are in a way complementary; the CANUPO filter performs well in removing low vegetation while the geometrical filter is highly successful in removing the tree trunks and crowns that are at a higher distance from the ground.
The combined methods performed clearly the best with a total error of approximately 1%. The combination of CANUPO and CSF yielded slightly better results than CANUPO combined with SMRF; the combination with PMF yielded still inferior results. A visualization of the filtering success is presented in Figure 3.

Discussion
The commonly used methods for filtering out vegetation from point clouds are based on the basic geometric premise with the object of interest (usually the ground) being at the bottom and vegetation on the top. This assumption is true when applying these methods to large-scale airborne LiDAR scans. However, with the increasing availability of point clouds-producing data collection methods, the application space is increasingly shifting towards small-scale industrial and civil engineering applications where the objective is to acquire the scene without intrusive vegetation cover. In such small-scale applications, the cheapest and most widely available method of data acquisition is the SfM processing of UAV-acquired camera imagery, which was also used as a data source in this paper.
In this study, filters, the high effectiveness of which has been reported elsewhere [31] and [32], were used. Nevertheless, as the testing results indicate, these filters are principally unsuitable for steep rugged objects: the PMF filter yielded a total error of 60%, SMRF of 11% and CSF of 16%. The percentage expression should be also accompanied by a note that the erroneously removed data usually represent edges, projections or apices that are important from the application perspective as these pose the greatest danger of release and falling onto the railway track.
A horizontal transformation (levelling), the aim of which was to make the data more suitable for the standard geometrical filtering approaches, improved the results by about half but still did not lead to sufficiently good results (total errors of PMF 30%, SMRF 8%, CSF 6%; see Figure 3). The effectiveness of this approach could be theoretically improved by splitting the data into flatter parts, levelling such data, performing partial calculations, combining the results and re-assembling the fully filtered cloud and inverse transformation to the original position; this approach is, however, fairly laborious and the results are still uncertain. It should be noted that the mechanism of levelling (rotation) results in virtually tilted trees, which might affect the geometrical filtering as the vertical distances of, for example, the lowest branches of the trees or shrubs from the model surface can change, which can affect the performance of the geometrical filters (not of CANUPO as that filter uses a different principle independent of the rotation). However, as it does not alter/warp the spatial relationships within the cloud, this is a minor issue compared to the benefits that the rotation brought to the performance of the geometrical filters so the trade-off is still highly beneficial.
The CANUPO filter performed well in removing the "random" low vegetation close to the terrain but failed to remove much of the trunks and canopy (see Figure 3).
Our assumptions about the synergism of the geometric filters and CANUPO, as detailed in Section 2.4, were proven to be correct, thus confirming the suitability of their combined use (note that CANUPO should be applied before the geometrical filter because if a geometrical filter is used before CANUPO, the result would be practically identical to that obtained using the geometrical filter alone as the rocks would be clipped in the first step). As is obvious from Figure 3, geometrical filters tend to clip the rugged terrain whereas the CANUPO filter tends to correctly identify the terrain (and clean up the immediate vicinity of the terrain) but incorrectly leaves a relatively high amount of vegetation that can be found at a greater distance from the identified terrain. This can be then easily removed by more elastic geometrical filters (i.e., SMRF and CSF) after levelling the CANUPOfiltered point cloud. The experiments with such a combination of filters confirmed this hypothesis, resulting in a total error of only approximately 1% after such double filtering (see Figure 3). Such a result is already sufficient and suitable for most everyday civil engineering applications.
To the best of our knowledge, this is the first paper sequentially combining two filters with different principles for vegetation filtering in a steep rugged terrain with excellent results. This workflow can be highly useful for removing vegetation in all applications requiring a detailed digital terrain model of steep rugged terrain such as the safety evaluation of rock faces in the vicinity of public footpaths in nature reserves, above roads or, as in this case, railway tracks.

Conclusions
This study was the first to optimize and evaluate the performance of four filters available in commonly used software for the vegetation filtering of SfM data acquired from UAV imagery in a steep and rugged rocky terrain. We concluded that (i) standard geometric filters (PMF, SMRF, CSF) do not perform well when applied to such terrain as they excessively remove the terrain irregularities; (ii) after levelling the data by rotating them into the horizontal position, the results improve but still remain fairly poor. This is especially true for the PMF filter, which, due to its low elasticity, is completely unsuitable for such a rugged terrain; (iii) the principally different filter, CANUPO, performed better and, most importantly, (iv) in a complementary way to the geometrical filters. It performed well in removing the low vegetation while failing to remove much of the higher vegetation, thus setting the stage for geometrical filters (SMRF and CSF), which subsequently completed the data cleanup from vegetation with the excellent success of approximately 1% error only. Thus, we can (v) propose that the workflow presented in this paper for vegetation filtering of such difficult terrain, i.e., CANUPO filtering followed by the levelling of the point cloud and using the SMRF or CSF filter for removing the remaining vegetation at a greater distance from the terrain, is a highly suitable approach applicable, for example, in civil engineering for safety evaluations of steep slopes in the vicinity of linear constructions.