A 3 D Point Cloud Filtering Method for Leaves Based on Manifold Distance and Normal Estimation

Leaves are used extensively as an indicator in research on tree growth. Leaf area, as one of the most important index in leaf morphology, is also a comprehensive growth index for evaluating the effects of environmental factors. When scanning tree surfaces using a 3D laser scanner, the scanned point cloud data usually contain many outliers and noise. These outliers can be clusters or sparse points, whereas the noise is usually non-isolated but exhibits different attributes from valid points. In this study, a 3D point cloud filtering method for leaves based on manifold distance and normal estimation is proposed. First, leaf was extracted from the tree point cloud and initial clustering was performed as the preprocessing step. Second, outlier clusters filtering and outlier points filtering were successively performed using a manifold distance and truncation method. Third, noise points in each cluster were filtered based on the local surface normal estimation. The 3D reconstruction results of leaves after applying the proposed filtering method prove that this method outperforms other classic filtering methods. Comparisons of leaf areas with real values and area assessments of the mean absolute error (MAE) and mean absolute error percent (MAE%) for leaves in different levels were also conducted. The root mean square error (RMSE) for leaf area was 2.49 cm2. The MAE values for small leaves, medium leaves and large leaves were 0.92 cm2, 1.05 cm2 and 3.39 cm2, respectively, with corresponding MAE% values of 10.63, 4.83 and 3.8. These results demonstrate that the method proposed can be used to filter outliers and noise for 3D point clouds of leaves and improve 3D leaf visualization authenticity and leaf area measurement accuracy.


Background
Leaf parameters, including color features and morphological traits such as leaf length, leaf initiation angle, leaf width, leaf thickness and leaf area, are valuable information about plants.Phenotype measurement and classification using 2D images and 3D models based on leaves have become popular topics in recent years [1][2][3].With the rapid development of laser scanner, remote sensing is widely used in 3D reconstruction and recognition in different fields due to the advantages of non-contact, high precision and high efficiency [4].In the past few decades, 3D reconstruction from point clouds has gained great attention, especially for parameter measurement for different types of trees [5][6][7].Yun et al. [8] proposed a leaf area measurement method based on 3D point cloud reconstruction using ground-based light detection and ranging (LiDAR).First, they utilized a support vector machine (SVM) to separate various tree organs from point clouds.Then, the moving least squares (MLS) method was adopted to remove ghost points.Finally, the leaf area was calculated by 3D surface reconstruction using a triangulation algorithm.Bailey et al. [9] developed a semidirect tree reconstruction method using terrestrial LiDAR point cloud data.They developed a digital tree model including the position, orientation and size of every leaf.However, point cloud sources are easily contaminated by noise and outliers [10].To improve the authenticity of 3D visualization and the validity of parameter measurement, an effective outliers and noise filtering method is needed.
In recent years, many filtering methods have been proposed for 2D images and 3D models.Depending on the source, noise can be divided into internal and external noise; while according to the relationship with signals, noise can be divided into additive and multiplicative noise.The most popular classification is based on the probability density function of images, which classifies noise into Gaussian noise, Gamma noise, Rayleigh noise and so forth.Gaussian noise [11], which the probability density function is a normal distribution, is the most common and classical forms of noise.However, for 3D point clouds, noise models, especially for outlier models, are difficult to obtain.
In the past few decades, the development of robust point cloud denoising algorithms has received extensive attention.The goal of such algorithms is either to remove or to reduce the noise in the data while preserving sharp features on the original point cloud surface.Research in the field of digital image filtering has been adapted for point cloud filtering algorithms but it is not direct due to the irregularity, shrinkage and drifting of point clouds.In recent years, a number of filtering methods for 3D point cloud have been developed, such as data clustering [12,13], density-based function [14,15], principal component analysis (PCA) [16][17][18], locally optimal projection (LOP) [19,20], MLS [21,22], nonlocal methods [23,24] and partial differential equations (PDEs) [25,26].Zaman et al. [14] proposed a point cloud denoising method based on a kernel density function.First, the density of the input data points was estimated using kernel density estimation and the particle swarm optimization (PSO) technique was employed to automatically approximate the optimal bandwidth of the kernel density.Then, a mean-shift-based clustering technique was used to remove outliers through a thresholding scheme.The MLS and PCA methods treat outliers as points with large noise and project the noise to an estimated surface.The LOP method iteratively projects a subset of the input point cloud onto the point cloud to reduce noise and outliers.The nonlocal means algorithm (NL-means) is an improvement of the classical mean filtering introduced by Buades et al. [27].In this algorithm, the estimated value for a pixel is computed as a weighted average of related surrounding neighborhoods.However, Mahmoudi et al. [28] considered this method computationally impractical.Thus, they preclassified neighborhoods and reduced the original quadratic complexity to a linear problem to accelerate the algorithm by reducing the influence of less related areas.BM3D [29] image denoising algorithms for point cloud denoising were introduced to achieve good performance and robustness to high levels of noise.These methods exploit the inherent self-similarity characteristics of surface patches in the point cloud to preserve structural details.Prasath [30] used multiple scales contained in digital images to produce anisotropic PDEs, in contrast to previous approaches and the proposed scheme improved both noise removal and detail preservation.PDE-based techniques for filtering 3D point clouds compute partial differential properties [31] and can be considered an extension of techniques for triangular meshes.
However, several problems remain for these outliers and noise filtering methods.(1) Some methods based on statistics require prior knowledge of the input objects, while the underlying distribution is unknown.(2) Some methods are vulnerable to the scale of point cloud data and oversmooth points.(3) Some nonlinear methods are sensitive to the structural properties of the point cloud when estimating accurate point cloud features.(4) Some methods achieve better filtering performance at the expense of time and complexity.Some point cloud processing platforms, such as CloudCompare and MeshLab and universal programming libraries, such as PCL and OpenGL, also provide directly available tools and functions for point cloud filtering.Although using these tools and functions makes filtering and denoising easier and more convenient, new optimal methods are still in insatiable demand.For one thing, these tools and functions are essentially simple algorithms.For example, the SOR filter in CloudCompare uses the average distance of each point to its neighbors and eliminates the points that are farther than the threshold distance.It is an Euclidean distance-based ideology like before.Moreover, these tools and functions are usually strong in applicability and generality, which leads to the lack of pertinence to specific point cloud features.

Objectives
Focusing on the above shortcomings that have plagued filtering performance for point cloud data, the primary goal of this study is to develop a 3D point cloud filtering method for leaves.The method was based on local plane normal features constructed by the manifold distance, with the important premise that outlier and noise have totally different definitions and should be filtered separately.First, the manifold distance was introduced for the detection of outlier clusters and outlier points in each cluster.Then, PCA was employed to estimate the partial surface normal vector to identify noise points.The other objectives of this study are to (1) investigate an appropriate threshold value in the statistical truncation method to optimize the oversmoothing problem, (2) assess the effects of combining the manifold distance and local surface normal estimation for noise detection, (3) balance the relationship between filtering accuracy and computational complexity and (4) validate the authenticity of reconstruction and visualization using the point cloud after filtering and assess the accuracy of leaf parameter estimation.

Data Collection by TLS
A terrestrial laser scanner (TLS, Leica C10) was used to obtain point cloud data.This scanner is a single-point-positioning scanner with high accuracy and precision.It can obtain massive point clouds efficiently and quickly without any intervention.The maximum scanning distance of the Leica C10 is 300 m and the scanning range is 360 • × 270 • (horizontal × vertical).While scanning, the TLS was placed successively on different sides 3m from the target tree.After scanning, each scan from the different angle was integrated into a single coordinate system by using a registration process to acquire full coverage scanning data from the objective trees.The experimental trees were selected on the campus of Nanjing Forestry University (32 • 08 N, 118 • 81 E) and included several species, such as poplar, sakura and Liriodendron chinensis.

Framework of the Proposed Method
This section describes in detail the method we proposed for 3D point cloud filtering for leaves.Partial point clouds were manually segmented from the whole point clouds scanned by the Leica C10.The method in Reference [32] was employed to cluster the partial point clouds and extract well-isolated individual trees from groups.The point clouds of leaves were separated using the tree organ classification method proposed in Reference [8].Taking a poplar leaf as an example, the whole process, from the acquisition of the point cloud to the point cloud after filtering, followed the sequential steps shown in Figure 1a.
The points that were to be filtered contained both outliers and noise, which had totally different definitions in this study.The exact definition of an outlier often depends on hidden assumptions regarding the data structure and the applied detection method [33].Hawkins et al. [34] defined an outlier as an observation that deviates so much from other observations as to arise suspicion that it is generated by a different mechanism.Barnett et al. [35] indicated that an outlying observation or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs.In this study, outliers were defined in a distance-based manner and divided into two categories: outlier clusters and outlier points.The manifold distance was used for distance measurement instead of the traditional Euclidean distance.Outlier clusters are more subject to masking and swamping effects, which are universal phenomena in outlier detection.Acuna et al. [36] provided an intuitive definition of these effects.If an outlier cluster is treated the same as a series of individual outlier points, these circumstances are more likely to occur.Hence, in this method, outlier cluster detection was the first step after the initial clustering.Subsequently, in each remaining cluster, outlier point detection and noise point detection were successively processed.At last, combining all remaining points and the point cloud after filtering was obtained.The detailed work program flow of the proposed method is shown in Figure 1b.

Adapted Truncation Method
The truncation method is a technique used to filter abnormal values in a data set.For a given data set i q , ( ) , which is sorted in ascending order and M is the total number of this data set.The quartiles of this data set divide it into four equal parts, each of which contains 25 percent of the data.The numbers at the first quorum and the third quorum are lower quartile

Adapted Truncation Method
The truncation method is a technique used to filter abnormal values in a data set.For a given data set q i , (i = 1, 2, . . ., M), which is sorted in ascending order and M is the total number of this data set.The quartiles of this data set divide it into four equal parts, each of which contains 25 percent of the data.The numbers at the first quorum and the third quorum are lower quartile Q 1 and upper quartile Q 3 , respectively, which are expressed as follows: and The specific procedures of the truncation method to filter abnormal values in the data set are as follows: First, calculating Q 1 and Q 3 using Equation (1) and Equation (2); Second, calculating the interquartile range R iq , which can be formulated by Third, calculating the lower truncation T L and upper truncation T H according to the following formula: and while µ denotes the threshold value of the truncation method.q i < T L and q i > T H are defined as the abnormal values of this data set.
In the universal truncation method, the threshold value µ usually equals 1.5.Although using quartiles as numerical characteristics makes the method robust, it is more suitable for data with a regular distribution, such as a normal distribution.Most abnormal values would be omitted by a smaller T L and larger T H . Hence, it was inappropriate for our point cloud data.A more suitable threshold value is given in this study.

Measuring Manifold Distance
Points on adjacent positions have high similarities.However, the Euclidean distance can measure only local congruence instead of that of the whole point cloud.Figure 2a presents the two connection modes of two points in a point cloud from the front view and Figure 2b shows the horizontal view.Clearly, the Euclidean distance in the red line between two points cannot reflect the whole manifold similarities well, while the connection mode in blue, which we call the geodesic curve, better reflects the similarities between these two points.A geodesic curve in differential geometry is defined as follows: for any curve on a surface, the geodesic curvature of each point on it is zero.All points on the geodesic curve are on the surface and the distance between two points is shorter than any other paths between these two points.This distance, which we call the shortest path, more accurately reflects the similarities between two points on a surface.The shortest path problem is broad but the variable weight in this study refers to the distance between two points.
For a sample point cloud P, P i ∈ P and P j ∈ P, (i, j = 1, 2, . . ., N), N is the number of points and two methods can be used to find the shortest path between P i and P j .One method is to go directly from P i to P j ; the other is to go to P j through point P k ∈ P, (k = 1, 2, . . ., N), as depicted in Figure 2c, which is the local enlarged point cloud of Figure 2a,b.Defining d P i , P j and d P i , P j as the distance and the shortest distance between P i and P j , we obtain d (p i , p j ) = d(P i , P j ) d(P i , P j ) < d(P i , P k ) + d(P k , P j )  Common solutions of the shortest path problem include the Floyd, Dijkstra and Bellman-Ford algorithms [37].Among these algorithms, the traditional Floyd algorithm is globally optimal and the main program is simple, easy to implement.Therefore, the Floyd algorithm was used in this study.First, for points ( )

Constructing the adjacency matrix
) , the traditional Euclidean distance is defined as and for any positive integer k ,   can be determined, where im P represents the k nearest neighbor of point i P .
Second, initializing the adjacency matrix is as follows: ( ) Third, the distance matrix N N  D is iterated continuously using Equation ( 6) and ( 8) while traversing all points in the data set.Until the iteration terminates, each element ( ) is represented as the manifold distance between points i P and j P .
However, the Floyd algorithm, as discussed above, essentially proceeds through a triple loop.
Excessive iteration leads to computational complexity and requires an extremely large amount of storage, especially for a large number of nodes.Unfortunately, the point cloud scanned by TLS are usually numerous and in high density.Two improvements of the Floyd algorithm were proposed by Wei [38].The first improvement is the construction of an iterative matrix in which all nodes are initially compared; nodes that are not related to the results are removed to reduce the frequency of iteration when searching for the next node.Second, a serial number matrix is constructed to record Common solutions of the shortest path problem include the Floyd, Dijkstra and Bellman-Ford algorithms [37].Among these algorithms, the traditional Floyd algorithm is globally optimal and the main program is simple, easy to implement.Therefore, the Floyd algorithm was used in this study.
Constructing the adjacency matrix D N×N and iterating D N×N to approximate the manifold distance by the traditional Euclidean distance are the key steps in the Floyd algorithm.Using the two methods mentioned previously in the lemma, the main procedures are as follows: First, for points P i (x i , y i , z i ) and P j x j , y j , z j , the traditional Euclidean distance is defined as and for any positive integer k, ϕ i = {P im }, (m = 1, 2, . . ., k) can be determined, where P im represents the k nearest neighbor of point P i .Second, initializing the adjacency matrix D N×N , D(i, j) is as follows: Third, the distance matrix D N×N is iterated continuously using Equation ( 6) and ( 8) while traversing all points in the data set.Until the iteration terminates, each element D(i, j) is represented as the manifold distance between points P i and P j .
However, the Floyd algorithm, as discussed above, essentially proceeds through a triple loop.Excessive iteration leads to computational complexity and requires an extremely large amount of storage, especially for a large number of nodes.Unfortunately, the point cloud scanned by TLS are usually numerous and in high density.Two improvements of the Floyd algorithm were proposed by Wei [38].The first improvement is the construction of an iterative matrix in which all nodes are initially compared; nodes that are not related to the results are removed to reduce the frequency of iteration when searching for the next node.Second, a serial number matrix is constructed to record the case of inserting a node in the process of iteration.Aini et al. [39] named another contribution: the rectangular algorithm.This algorithm benefits from a rectangle graphical approach, requires less computational effort and is easier to implement than the Floyd algorithm.However, both of these methods focus on the iteration process itself.Our proposed method deals with the data size directly.The initial clustering is processed at the beginning and under this condition, the method can proceed with the Floyd algorithm for only one cluster at a time.Without removing any points or constructing any new matrices, the computational time is reduced.

Filtering Outlier Clusters
As illustrated, initial clustering is the preprocessing step in this method.The K-means clustering, which is simple and easy to implement, was used to divide the original point cloud P into K clusters, which denoted by R l , (l = 1, 2, . . ., K).K is an integer by rounding up N/100 and N represents the number of points.
To measure how far cluster R l is from the other K − 1 clusters, which denoted by d C (l), the mathematical models are as follows: and where C l is the center of R l , P l i , (i = 1, 2, . . ., N l ) are the points in cluster R l , N l represents the number of points and d c (l, l ) represents the manifold distance between clusters C l and C l .Using the truncation method with the newly defined threshold value µ to calculate the upper truncation T H 1 of data set d C and a cluster is defined as an outlier cluster when its d C (l) is bigger than T H 1 .
The detailed procedures are given as follows: Input: Point cloud P

1.
Divide the original point cloud data into K clusters 2.
Calculate C l of R l using Equation (9) 3.
Calculate d c (l, l ) using the method in 2.4.1 4.
For l = 1 : K Define R l as an outlier cluster and remove R l 9.

Filtering Outlier Points
After filtering the outlier clusters, new clusters without outlier clusters R l , (l = 1, 2, . . ., K ) and points P l i , (i = 1, 2, . . .N l ) in R l are obtained, where K represents the number of remaining new clusters.The manifold distance between point P l i and P l j in cluster R l , denoted by d p l i , l j , can be used to obtain k nearest points of P l i , which is expressed as ϕ l i = P l im , (m = 1, 2, . . ., k).
For each cluster, to measure how far the point P l i is from the other points in this cluster, which denoted by d p (l i ), the mathematical model is: Calculating the upper truncation T H 2 of data set d p (l) and a point is defined as an outlier point in this cluster when its d p (l i ) is bigger than T H 2 .
The detailed procedures are given as follows: Input: R l , (l = 1, 2, . . ., K ), Calculate d p l i , l j using the method in 2.4.1
Calculate the upper truncation T H 2 of d p (l) using Equation (5) and µ 8.
. Define P l i as the outlier point and remove P l i 11.
end for 13.end for Output: Clusters R l without outlier points: P l i ∈ R l , i = 1, 2, . . ., N l 2.5.Noise Detection and Filtering Based on the Normal Feature of Local Surfaces

Local Surface Fitting and Normal Estimation
PCA is a dimensional reduction method used in feature extraction, especially for surface normal estimations.For a given data point P i and its k nearest neighbor set ϕ i = {P im }, the covariance matrix C 3×3 of this local neighborhood is defined as follows: where P i is the mean of ϕ i = {P im } and can be calculated by the following formula: Using singular value decomposition (SVD), C 3×3 can be decomposed into three principal components v 1 , v 2 , v 3 , which are the three eigenvectors corresponding to the eigenvalues λ 1 , λ 2 , λ 3 , respectively, sorted in descending order.The third principal component v 3 corresponds to the direction in which the projected observations have the smallest variance, which means that eigenvector v 3 approximates the normal direction of this local neighborhood.
However, many previous studies have pointed out that PCA is sensitive to outliers and fails to reliably fit planar surfaces in the presence of outliers [40].Hence, many methods that can make PCA results robust have been proposed.Hubert et al. [41] combined robust covariance estimation with Projection Pursuit and proposed RPCA.They claimed that their method yielded accurate estimates for outlier-free datasets and more robust estimates for outlier-contaminated data and had the further advantage of outlier detection [32].Rousseeuw et al. [42] developed fast-MCD in 1999 and Nurunnabi et al. [43] used fast-MCD-based RPCA to fit a planar surface.However, most of the above methods have yielded limited improvements in the definition of the distance weight.Traditional PCA compares Euclidean distances of potential target points to find the local neighborhood of one point.It provides less robust local neighborhood relationships for our point cloud data, which is crucial for deciding which points should contribute to the normal estimations.To increase the accuracy of the results, the manifold distance in D N×N is used in this study to construct the local neighborhood and calculate the normal estimation of each point.

Filtering Noise Points
After filtering outliers, remaining new clusters, R l , (l = 1, 2, . . ., K ) and points P l i , (i = 1, 2, . . ., N l ) in R l , were obtained, while N l represents the new number of points in R l .For each cluster, using the manifold distance to construct k nearest neighbor set ϕ l i = P l im , (m = 1, 2, . . ., k) of each point P l i x l i , y l i , z l i and the angle between this point and its normal direction v 3 l i (v x l i , v y l i , v z l i ) was used to decide whether this point is the noise point in this cluster, which can be described as follows: Calculating the lower truncation T L 3 and upper truncation T H 3 of the θ(l) and points, angles of which are between these two bounds, will be retained.Otherwise, they are removed as noise points.
Calculate v 3 l i of point P l i using Equation (12), Equation ( 13) and SVD
Calculate T L 3 and T H 3 of θ(l) 8.
for i = 1 : Define P l i as a noise point and remove P l i 11.
end for 13.end for Output: Cluster R l without outlier and noise points, P l i ∈ R l , i = 1, 2, . . ., N l

Experiments and Results Analysis
Numerous comparison analyses were performed to evaluate the effectiveness and practicability of the proposed method.Leaves of trees with different heights and diameters at breast height (DBH) were collected as analysis samples, as shown in Table 1.In Section 3.1, the effects of the different threshold values of the truncation method on the recognition rates are compared intuitively.In Section 3.2, the proposed method is compared with two other methods to evaluate the filtering performance of the proposed method.3D reconstruction and visualization results are also presented in this section.In Section 3.3, analyses of leaf area measurement are given to further verify the applicability of the proposed method.0 as the interval.Figure 3 gives a poplar leaf sample.

Filtering Performance Evaluation Based on 3D Reconstruction
The method we proposed has two main innovations: filtering outlier clusters, outlier points and noise points separately and using manifold distance as a metric.To highlight the advantages of these two innovations, the comparison analyses were carried out with other two methods: using the Euclidean distance method and classical PCA method.The Euclidean distance method uses the same procedures as the method we proposed excluding the use of manifold distance and classical PCA is a standard method that neither filters outliers and noise separately nor uses manifold distance.
Leaves of different species and levels were chosen from the collected samples to conduct a comprehensive assessment.3D reconstructions of leaves were obtained by the nonuniform rational B-spline (NURBS) method, which has become a widely utilized approach for surface reconstruction in recent years [37,44].The presence of outliers and noise would mislead the choices of control points The points in blue are the outliers and noise detected by the truncation method.Clearly, when µ = 1.5, the recognition rate is very low.In addition, compared with Figure 3d,e, Figure 3f reveals a better filtering performance.However, when µ = 0.3 in Figure 3g, the point cloud is oversmoothed.The results demonstrated that the threshold value µ = 0.6 was suitable for the leaf point cloud.

Filtering Performance Evaluation Based on 3D Reconstruction
The method we proposed has two main innovations: filtering outlier clusters, outlier points and noise points separately and using manifold distance as a metric.To highlight the advantages of these two innovations, the comparison analyses were carried out with other two methods: using the Euclidean distance method and classical PCA method.The Euclidean distance method uses the same procedures as the method we proposed excluding the use of manifold distance and classical PCA is a standard method that neither filters outliers and noise separately nor uses manifold distance.
Leaves of different species and levels were chosen from the collected samples to conduct a comprehensive assessment.3D reconstructions of leaves were obtained by the nonuniform rational B-spline (NURBS) method, which has become a widely utilized approach for surface reconstruction in recent years [37,44].The presence of outliers and noise would mislead the choices of control points in NURBS, which indirectly influence the fitting performance.Moreover, 3D visualizations of the samples were constructed using VC++ 2015 and OpenGL.
To demonstrate the performance of filtering using different methods, mature leaves were taken as examples.Figure 4 shows the experimental results for a mature poplar leaf.The blue points in Figure 4a-c, which represent outliers and noise found by the filtering methods, reveal that the proposed method generally had better performance in terms of both quality and quantity in filtering, resulting in better reconstruction of the point cloud.The proposed method and method using the Euclidean distance method, all treated outlier clusters and points separately, thereby avoiding masking and swamping effects to some extent, whereas the classical PCA, as shown in Figure 4c,f, had the worst filtering results.Comparison results fully explains why it is necessary to filter outlier clusters and points separately.In addition, the point cloud after filtering in Figure 4d is closer to the shape of the real leaf than that in Figure 4e,f.Hence, the reconstruction and visualization in Figure 4g outperformed that in Figure 4h,i To demonstrate the performance of filtering using different methods, mature leaves were taken as examples.Figure 4 shows the experimental results for a mature poplar leaf.The blue points in Figure 4a,b,c, which represent outliers and noise found by the filtering methods, reveal that the proposed method generally had better performance in terms of both quality and quantity in filtering, resulting in better reconstruction of the point cloud.The proposed method and method using the Euclidean distance method, all treated outlier clusters and points separately, thereby avoiding masking and swamping effects to some extent, whereas the classical PCA, as shown in Figure 4c,f, had the worst filtering results.Comparison results fully explains why it is necessary to filter outlier clusters and points separately.In addition, the point cloud after filtering in Figure 4d is closer to the shape of the real leaf than that in Figure 4e,f.Hence, the reconstruction and visualization in Figure 4g outperformed that in Figure 4h,i.For the mature sakura leaf point cloud, as shown in Figure 5a,b, the method we proposed and the Euclidean distance method yielded almost completely different filtering results because the leaf surfaces of sakura are usually curved.Consequently, using the manifold distance or not has great effects on the sakura leaf point cloud.The results in Figure 5c,f reveal that classical PCA was still not For the mature sakura leaf point cloud, as shown in Figure 5a,b, the method we proposed and the Euclidean distance method yielded almost completely different filtering results because the leaf surfaces of sakura are usually curved.Consequently, using the manifold distance or not has great effects on the sakura leaf point cloud.The results in Figure 5c,f reveal that classical PCA was still not good for filtering sakura leaf point cloud.Moreover, outliers and noise, which were omitted in filtering, directly caused the distortion of the 3D reconstruction in Figure 5h,i    Figure 6 gives the experimental results for a mature Liriodendron chinense leaf.The Liriodendron chinense leaf surface has many corners and edges.Hence, better filtering result was required for the its reconstruction and visualization.As shown in Figure 6e, the method using Euclidean distance omitted four single outliers in the lower left corner of the point cloud and the reconstruction of this leaf is out of shape in this position.Similarly, classical PCA could not handle points in the corners well and thus the reconstruction result in Figure 6i   In addition to mature leaves, tender leaves can also reflect the growth status of trees.Hence, 3D reconstruction results of tender leaves from the three species above were also given in Figure 7.The leaf areas of tender leaves are usually small.Although they have fewer points than mature leaves, the reconstruction and visualization of them are still real after filtering with the proposed method.It proves that the method we proposed can be applied to different leaves of different species and levels.

Leaf Area Measurement
The 3D reconstruction and visualization of the point cloud can be used to estimate many complex leaf traits.Leaf area is one of these traits and the application of the partial mesh area method based on the NURBS fitting surface can be used to calculate the leaf area [45].To further validate the performance of the proposed filtering method, leaf areas of samples of different sizes were calculated and 120 of them, which areas ranged from 5 to 80 cm 2 , were randomly chosen in analyses of leaf area In addition to mature leaves, tender leaves can also reflect the growth status of trees.Hence, 3D reconstruction results of tender leaves from the three species above were also given in Figure 7.The leaf areas of tender leaves are usually small.Although they have fewer points than mature leaves, the reconstruction and visualization of them are still real after filtering with the proposed method.It proves that the method we proposed can be applied to different leaves of different species and levels.In addition to mature leaves, tender leaves can also reflect the growth status of trees.Hence, 3D reconstruction results of tender leaves from the three species above were also given in Figure 7.The leaf areas of tender leaves are usually small.Although they have fewer points than mature leaves, the reconstruction and visualization of them are still real after filtering with the proposed method.It proves that the method we proposed can be applied to different leaves of different species and levels.

Leaf Area Measurement
The 3D reconstruction and visualization of the point cloud can be used to estimate many complex leaf traits.Leaf area is one of these traits and the application of the partial mesh area method based on the NURBS fitting surface can be used to calculate the leaf area [45].To further validate the performance of the proposed filtering method, leaf areas of samples of different sizes were calculated and 120 of them, which areas ranged from 5 to 80 cm 2 , were randomly chosen in analyses of leaf area

Leaf Area Measurement
The 3D reconstruction and visualization of the point cloud can be used to estimate many complex leaf traits.Leaf area is one of these traits and the application of the partial mesh area method based on the NURBS fitting surface can be used to calculate the leaf area [45].To further validate the performance of the proposed filtering method, leaf areas of samples of different sizes were calculated and 120 of them, which areas ranged from 5 to 80 cm 2 , were randomly chosen in analyses of leaf area measurement.Two comparisons were conducted: one was the comparison of the leaf area measurements after filtering using the proposed method and the actual area, as described in Figure 8 and the other one was the comparison of the leaf area measurements after filtering using classical PCA and the actual area, as described in Figure 9.The actual leaf area was obtained by manual measurement using an LI-3000C area meter.As shown in Figure 8, the RMSE of leaf area regression is 2.49 cm 2 and R-square is 0.98.However, the RMSE is 5.02 cm 2 and R-square is 0.95 in Figure 9.The measurement results from the 3D reconstruction of the point clouds after filtering with the proposed method were closer to the actual area values, which also indicated that the proposed filtering method can improve the performance of leaf area measurement.measurements after filtering using the proposed method and the actual area, as described in Figure 405 8 and the other one was the comparison of the leaf area measurements after filtering using classical

406
PCA and the actual area, as described in Figure 9.The actual leaf area was obtained by manual 407 measurement using an LI-3000C area meter.As shown in Figure 8, the RMSE of leaf area regression 408 is 2.49cm 2 and R-square is 0.98.However, the RMSE is 5.02cm 2 and R-square is 0.95 in Figure 9.The

419
The regression formula is measurement.Two comparisons were conducted: one was the comparison of the leaf area measurements after filtering using the proposed method and the actual area, as described in Figure 8 and the other one was the comparison of the leaf area measurements after filtering using classical PCA and the actual area, as described in Figure 9.The actual leaf area was obtained by manual measurement using an LI-3000C area meter.As shown in Figure 8, the RMSE of leaf area regression is 2.49cm 2 and R-square is 0.98.However, the RMSE is 5.02cm 2 and R-square is 0.95 in Figure 9.The measurement results from the 3D reconstruction of the point clouds after filtering with the proposed method were closer to the actual area values, which also indicated that the proposed filtering method can improve the performance of leaf area measurement.
Actual area (cm 2 )  Moreover, another comparison analyses using 120 leaf samples were conducted to further analyze the impacts that the filtering method had on leaves with different areas.These 120 leaves were classified as small leaves (SFs), with areas of 5 cm 2 -15 cm 2 ; medium leaves (MFs), with areas of 16 cm 2 -40 cm 2 ; and large leaves (LFs), with areas of 41 cm 2 -80 cm 2 .The mean absolute error (MAE) and mean absolute error percent (MAE%) of the samples filtered using the proposed method and classical PCA were separately calculated in analyses as follows: Moreover, another comparison analyses using 120 leaf samples were conducted to further analyze the impacts that the filtering method had on leaves with different areas.These 120 leaves were classified as small leaves (SFs), with areas of 5 cm 2 -15 cm 2 ; medium leaves (MFs), with areas of 16 cm 2 -40 cm 2 ; and large leaves (LFs), with areas of 41 cm 2 -80 cm 2 .The mean absolute error (MAE) and mean absolute error percent (MAE%) of the samples filtered using the proposed method and classical PCA were separately calculated in analyses as follows: and where A i and Âi are the actual leaf area and the measured leaf area and n is the number of leaf samples.
For each type of sample, n = 40.Table 2 gave the comparison results of the MAE and MAE% values for samples filtered by the proposed method and classical PCA.Clearly, the MAE and MAE% of the samples filtered using the proposed method are less than those filtered by classical PCA.This difference was more obvious when the area was smaller, especially for SFs.

Comparison with Existing Methods
In this paper, a new method for 3D leaf point cloud filtering was proposed.Above all, as discussed before, the Floyd algorithm suffers from high computational complexity, especially for large-scale point cloud data.Although [38] and [39] improved the Floyd algorithm, they did not deal with the data size directly.In this paper, the original point cloud data were initially clustered and under this circumstance, the Floyd algorithm proceeded with only one cluster with a fixed number of points at one time.In addition, a new filtering strategy was also used in the method.Previously, Sridhar et al. [46] proposed a formulation for outliers based on the distance of a point from its Kth nearest neighbors.They ranked each point on the basis of its distance to its Kth nearest neighbor and classified the top points in this ranking as outliers.Breunig et al. [47] used the local outlier factor (LOF) to measure the outlier degree of an object.Although these methods are practical for finding local outliers, they do not filter outlier clusters and outlier points separately.In this study, after the initial clustering was performed, outlier clusters were filtered and then outlier points were filtered.This strategy weakens the masking and swamping effects of outlier clusters and shows better filtering performance.Moreover, in this study, a manifold distance-based PCA method was applied for noise detection.PCA is a neighborhood-based filtering technique.This technique determines the filtered position of a point using similarity measures between a point and its neighborhood, which has a strong influence on the efficiency and effectiveness of the filtering approach [48].The similarity can be defined by the positions of points, normals or regions [49].Many previous PCA-based methods constructed neighborhoods using the traditional Euclidean distance.However, using the manifold distance to construct a local plane surface and determine which points should contribute to the normal estimations helps achieve more robust features and better filtering performance.

Recommendations
The method we proposed uses an adaptive threshold value in the truncation method to filter abnormal values.Figure 3 shows the filtering performance of this threshold value and the result was satisfactory.In fact, the filtering accuracy of the truncation method for point cloud data depends strongly on the properties of the point cloud data, such as scale and density, rather than the threshold value itself.Although the experimental results proved that the threshold value used in this method was appropriate for leaf point cloud data, for other specific point cloud data, specific analyses and comparisons should be performed to choose the suitable threshold value in the truncation method to ensure the filtering effect.

Conclusions
In this study, a 3D point cloud filtering method for leaves based on manifold distance and normal estimation was proposed and point cloud data were collected using a Leica C10 terrestrial laser scanner.First, the original point cloud was initialized into several clusters and the truncation method with the newly defined threshold value was used to filter outlier clusters.Second, for each new cluster, the outlier points were filtered using the same truncation method and noise points were then filltered by normal estimations and included angles.The truncation method was used again to filter points with singular angles.After traversing all clusters, the filtering process was complete.Three analyses were conducted in this study: an analysis of a suitable threshold value in the truncation method, a comparison of the filtering performance between the proposed method and two other methods and an evaluation of the leaf area measurement.The first analysis proved that the truncation method with the new threshold value was suitable for point cloud data.The second analysis indicated that the proposed method had better filtering performance in terms of both quality and quantity than the method using Euclidean distance and classical PCA.The last analysis demonstrated that the area measurement results for leaves after filtering using the proposed method were closer to the actual leaf areas.Considering these results, the proposed method can be deemed as an improvement of 3D point cloud filtering method for leaves.From a practical perspective, the proposed method can be further used for 3D reconstruction and leaf trait measurement.

Figure 1 .
Figure 1.Framework and program flow chart.(a) Framework of the whole process.(b) Detailed work program flow of the method.

1 Q and upper quartile 3 Q
, respectively, which are expressed as follows:

Figure 1 .
Figure 1.Framework and program flow chart.(a) Framework of the whole process.(b) Detailed work program flow of the method.

Figure 2 .
Figure 2. The connection modes (Euclidean distance in red and manifold distance in blue) of two points in a point cloud (a) from the front view and (b) from the horizontal view.(c) Two methods to find the shortest path between manifold distance by the traditional Euclidean distance are the key steps in the Floyd algorithm.Using the two methods mentioned previously in the lemma, the main procedures are as follows:

Figure 2 .
Figure 2. The connection modes (Euclidean distance in red and manifold distance in blue) of two points in a point cloud (a) from the front view and (b) from the horizontal view.(c) Two methods to find the shortest path between P i and P j .

Figure 3 .
Figure 3. Filtering performance analyses of different threshold values  . (a) Real leaf point cloud. (b) Original point cloud of the part in the red circle in (a).(c) Filtering performance when 5 . 1 =  .(d)

Figure 4 .
Figure 4. Comparison analyses for mature poplar leaf filtering and visualization: the filtering performance of (a) the proposed method, (b) the Euclidean distance method and (c) the classical PCA method; the point cloud after filtering using (d) the proposed method, (e) the Euclidean distance and (f) classical PCA; 3D reconstruction after filtering using (g) the proposed method, (h) the Euclidean distance, (i) and classical PCA.

Figure 4 .
Figure 4. Comparison analyses for mature poplar leaf filtering and visualization: the filtering performance of (a) the proposed method, (b) the Euclidean distance method and (c) the classical PCA method; the point cloud after filtering using (d) the proposed method, (e) the Euclidean distance and (f) classical PCA; 3D reconstruction after filtering using (g) the proposed method, (h) the Euclidean distance, (i) and classical PCA.

.
Remote Sens. 2018, 10, x FOR PEER REVIEW 13 of 19 Remote Sens. 2018, 10, x; doi: FOR PEER REVIEW www.mdpi.com/journal/remotesensinggood for filtering sakura leaf point cloud.Moreover, outliers and noise, which were omitted in filtering, directly caused the distortion of the 3D reconstruction in Figure 5h,i.

Figure 5 .
Figure 5.Comparison analyses for mature sakura leaf filtering and visualization: the filtering performance of (a) the proposed method, (b) the Euclidean distance method and (c) the classical PCA method; the point cloud after filtering using (d) the proposed method, (e) the Euclidean distance and (f) classical PCA; 3D reconstruction after filtering using (g) the proposed method, (h) the Euclidean distance and (i) classical PCA.

Figure 6
Figure 6 gives the experimental results for a mature Liriodendron chinense leaf.The Liriodendron chinense leaf surface has many corners and edges.Hence, better filtering result was required for the its reconstruction and visualization.As shown in Figure 6e, the method using Euclidean distance omitted four single outliers in the lower left corner of the point cloud and the reconstruction of this leaf is out of shape in this position.Similarly, classical PCA could not handle points in the corners well and thus the reconstruction result in Figure 6i has sharper corners.

Figure 5 .
Figure 5.Comparison analyses for mature sakura leaf filtering and visualization: the filtering performance of (a) the proposed method, (b) the Euclidean distance method and (c) the classical PCA method; the point cloud after filtering using (d) the proposed method, (e) the Euclidean distance and (f) classical PCA; 3D reconstruction after filtering using (g) the proposed method, (h) the Euclidean distance and (i) classical PCA.
Figure6gives the experimental results for a mature Liriodendron chinense leaf.The Liriodendron chinense leaf surface has many corners and edges.Hence, better filtering result was required for the its reconstruction and visualization.As shown in Figure6e, the method using Euclidean distance omitted four single outliers in the lower left corner of the point cloud and the reconstruction of this leaf is out of shape in this position.Similarly, classical PCA could not handle points in the corners well and thus the reconstruction result in Figure6ihas sharper corners.

Figure 6 .
Figure 6.Comparison analyses for mature Liriodendron chinense leaf filtering and visualization: the filtering performance of (a) the proposed method, (b) the Euclidean distance method and (c) the classical PCA method; the point cloud after filtering using (d) the proposed method, (e) the Euclidean distance and (f) classical PCA; the 3D reconstruction after filtering using (g) the proposed method, (h) the Euclidean distance and (i) classical PCA.

Figure 7 .
Figure 7.The 3D reconstruction and visualization of (a) a tender poplar leaf, (b) a tender sakura leaf and (c) a tender Liriodendron chinense leaf.

Figure 6 .
Figure 6.Comparison analyses for mature Liriodendron chinense leaf filtering and visualization: the filtering performance of (a) the proposed method, (b) the Euclidean distance method and (c) the classical PCA method; the point cloud after filtering using (d) the proposed method, (e) the Euclidean distance and (f) classical PCA; the 3D reconstruction after filtering using (g) the proposed method, (h) the Euclidean distance and (i) classical PCA.

Figure 6 .
Figure 6.Comparison analyses for mature Liriodendron chinense leaf filtering and visualization: the filtering performance of (a) the proposed method, (b) the Euclidean distance method and (c) the classical PCA method; the point cloud after filtering using (d) the proposed method, (e) the Euclidean distance and (f) classical PCA; the 3D reconstruction after filtering using (g) the proposed method, (h) the Euclidean distance and (i) classical PCA.

Figure 7 .
Figure 7.The 3D reconstruction and visualization of (a) a tender poplar leaf, (b) a tender sakura leaf and (c) a tender Liriodendron chinense leaf.

Figure 7 .
Figure 7.The 3D reconstruction and visualization of (a) a tender poplar leaf, (b) a tender sakura leaf and (c) a tender Liriodendron chinense leaf.

409
measurement results from the 3D reconstruction of the point clouds after filtering with the proposed 410 method were closer to the actual area values, which also indicated that the proposed filtering method 411 can improve the performance of leaf area measurement.

Figure 8 .
Figure 8. Leaf areas derived from point clouds filtered with the proposed method vs. manual

Figure 9 .
Figure 9. Leaf areas derived from point clouds filtered with classical PCA vs. manual measurement.

Figure 8 .
Figure 8. Leaf areas derived from point clouds filtered with the proposed method vs. manual measurement.The regression formula is 331 .0 995 .0 − = x y .

Figure 9 .
Figure 9. Leaf areas derived from point clouds filtered with classical PCA vs. manual measurement.The regression formula is

Table 1 .
Summary of sample tree and leaf parameters.Effects of the Threshold Value µ on Recognition Rates Different threshold value µ resulted in totally different filtering performance for the leaf point cloud.To find a suitable threshold value for our point cloud, this study tested five different values for each leaf sample, with 0.3 as the interval.Figure3gives a poplar leaf sample.
Remote Sens. 2018, 10, x; doi: FOR PEER REVIEW www.mdpi.com/journal/remotesensing3.1.Effects of the Threshold Value  on Recognition Rates Different threshold value  resulted in totally different filtering performance for the leaf point cloud.To find a suitable threshold value for our point cloud, this study tested five different values for each leaf sample, with 3 .
. which indirectly influence the fitting performance.Moreover, 3D visualizations of the samples were constructed using VC++ 2015 and OpenGL.

Table 2 .
Comparison of the MAE and MAE% values for samples filtered with the proposed method and classical PCA.