Annular Neighboring Points Distribution Analysis: A Novel PLS Stem Point Cloud Preprocessing Algorithm for DBH Estimation

: Personal laser scanning (PLS) has signiﬁcant potential for estimating the in-situ diameter of breast height (DBH) with high e ﬃ ciency and precision, which improves the understanding of forest structure and aids in building carbon cycle models in the big data era. PLS collects more complete stem point cloud data compared with the present laser scanning technology. However, there is still no signiﬁcant advantage of DBH estimation accuracy. Because the error caused by merging di ﬀ erent point cloud fragments has not yet been eliminated, overlapping and inaccurate co-registered point cloud fragments are often inevitable, which are usually the leading error sources of PLS-based DBH estimation. In this study, a novel pre-processing algorithm named annular neighboring points distribution analysis (ANPDA) was developed to improve PLS-based DBH estimation accuracy. To reduce the impact of inaccurately co-registered point cloud fragments, ANPDA identiﬁed outliers through iterative removal of outermost points and analyzing the distribution of annular neighboring points. Six plots containing 247 trees under di ﬀ erent forest conditions were selected to evaluate the ANPDA. Results showed that in the six plots, error reductions of 53.80–87.13% for bias, 38.82–57.30% for mean absolute error (MAE), and 27.17–56.02% for root mean squared error (RMSE) were achieved after applying ANPDA. These results conﬁrmed that ANPDA was generally e ﬀ ective for improving PLS-based DBH estimation accuracy. It appeared that ANPDA could be conveniently fused with an automatic PLS-based DBH estimation process as a preprocessing algorithm. Furthermore, it has the potential to predict and warn operators of potential large errors during hierarchical semi-automatic DBH estimation.


Introduction
Forestry inventory plays an important role in carbon storage estimation, which provides a basis for sustainable management and helps researchers understand terrestrial carbon cycling in the era of big data [1,2]. Of all the forest parameters that can be obtained through nondestructive in-situ measurements, the diameter of the breast height (DBH, 1.30 m height) not only helps researchers understand the structure of a forest but also reflects forest growth state. It contributes to aboveground forest biomass estimation and bridges empirical and process-based models, which directly inform global carbon cycling models [3,4].
For the past few years, an emerging technology known as personal laser scanning (PLS) has shown good potential for in-situ forestry measurements, especially for DBH estimation [4][5][6][7][8][9][10][11][12][13][14]. By the time the concept of PLS was developed, laser scanning technology has been gradually replacing manual DBH measurement due to its high efficiency in collecting accurate data. At present, operational separated cluster removal [8,11]. The efficacy of these methods for the overlapping fragment problem has not been systematically analyzed. To the best of our knowledge, there is currently no algorithm that reduces PLS data-based DBH estimation error due to overlapping, inaccurately co-registered point cloud fragments.
In this study, a novel PLS stem point cloud preprocessing algorithm named annular neighboring points distribution analysis (ANPDA) was developed for the DBH estimation. ANPDA was an algorithm that enhanced circle fitting-based DBH estimation. It reduced the impacts of inaccurately co-registered point cloud fragments by iteratively removing the outermost points from a two-dimensional horizontal projected point cloud. Outliers were identified by analyzing the polar angle distribution of points in the annular neighborhood of the outermost points. The concept of relative entropy was used to quantify the distribution similarity degree and determine the termination criterion for iterative outermost point removal. Points that remained after the critical iteration were grouped and exported as the output point cloud. The methodological contributions of this paper were to (1) support the current PLS-based automatic DBH estimation framework by providing a preprocessing strategy for inaccurate, overlapping point cloud fragments; (2) consider the stem point cloud outlier identification problem based on the polar angle distribution of points in the annular neighborhood; (3) introduce relative entropy to quantify the degree of similarity between point cloud distributions to determine if outliers have been properly removed.

Study Area
The study site of this study was located in Guangxi Gaofeng state-owned forest farm, which is shown in Figure 1. It is a typical subtropical forest farm in southwest China (22.82 • -23.25 • N 108.13 • -108.88 • E). The dominant species of the forest farm are eucalypt and Chinese Fir. Most of the forests grow on the hillside, which makes it difficult for vehicles to enter. Study sites were, therefore, selected in the forest plantation with little understory vegetation to be sure they were accessible for PLS.
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 25 there is currently no algorithm that reduces PLS data-based DBH estimation error due to overlapping, inaccurately co-registered point cloud fragments.
In this study, a novel PLS stem point cloud preprocessing algorithm named annular neighboring points distribution analysis (ANPDA) was developed for the DBH estimation. ANPDA was an algorithm that enhanced circle fitting-based DBH estimation. It reduced the impacts of inaccurately co-registered point cloud fragments by iteratively removing the outermost points from a twodimensional horizontal projected point cloud. Outliers were identified by analyzing the polar angle distribution of points in the annular neighborhood of the outermost points. The concept of relative entropy was used to quantify the distribution similarity degree and determine the termination criterion for iterative outermost point removal. Points that remained after the critical iteration were grouped and exported as the output point cloud. The methodological contributions of this paper were to (1) support the current PLS-based automatic DBH estimation framework by providing a preprocessing strategy for inaccurate, overlapping point cloud fragments; (2) consider the stem point cloud outlier identification problem based on the polar angle distribution of points in the annular neighborhood; (3) introduce relative entropy to quantify the degree of similarity between point cloud distributions to determine if outliers have been properly removed.

Study Area
The study site of this study was located in Guangxi Gaofeng state-owned forest farm, which is shown in Figure 1. It is a typical subtropical forest farm in southwest China (22.82°-23.25°N 108.13°-108.88°E). The dominant species of the forest farm are eucalypt and Chinese Fir. Most of the forests grow on the hillside, which makes it difficult for vehicles to enter. Study sites were, therefore, selected in the forest plantation with little understory vegetation to be sure they were accessible for PLS.  To test whether ANPDA could be used for PLS-based DBH estimation under different forest conditions, six plots with a total of 247 trees of different species were selected. Each plot had a fixed size of 25 m × 25 m. Three forest conditions with different degrees of data acquisition difficulty were included to test the adaptability of ANPDA. Plots 1 and 2 had flat and smooth surfaces, which allowed the PLS operator to walk naturally at a steady pace. Plots 3 and 4 were on a machine-graded hillside that made them steep but with a relative tidy slope surface. While collecting data, the PLS operator maintained a low center of gravity for balance at a slower and less steady pace. Plots 5 and 6 were on a hillside with soft and moist soil. While collecting data, extra care had to be taken to avoid slipping. As a result, the sensor might have experienced sudden displacement and spinning.
The forest conditions in the plots are shown in Table 1.

Data Acquisition
To facilitate data acquisition, the experiment was conducted during a less rainy season. All data in this study were collected in January 2018.

Point Cloud Data
The PLS applied in this study was a Heron AC-1 Color wearable mapping system. It was controlled with a tablet and carried on the operator's back while working. The technical specifications are shown in Table 2 [38]. For point cloud data acquisition of each plot, the PLS operator walked in an approximate figure-eight shape, which is often used in outdoor point-cloud-based modeling. As shown in Figure 2, the operator began and ended at the center to make a closed route. The practical walking path should Remote Sens. 2020, 12, 808 5 of 24 be fixed by the operator on the spot with all the stems covered inside the route. After confirmation, the point cloud data was exported and clipped by the operator. While collecting data, rough co-registration was generated in real-time. When the scanning was finished, a more precise co-registration was implemented, which took a few minutes for each plot. The average point density of each plot was 21,028 points/m 2 .
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 25 registration was generated in real-time. When the scanning was finished, a more precise coregistration was implemented, which took a few minutes for each plot. The average point density of each plot was 21,028 points/m 2 . Figure 2. The data collection route using personal laser scanning (PLS). Table 3 summarizes the DBH and density of trees in each plot. The DBH of the stems varied from 7.1 cm to 30.4 cm, with a density of 560 to 832 stems/ha. Each stem was indexed, and the DBH was manually measured using diameter tape. The position of each stem in each plot was then measured with the total station according to the index. These positioning data were collected to match the manually measured DBH with the stem point clouds, so positioning errors at the decimeter level were acceptable. The impact of the stem diameter for positioning was, therefore, ignored. Reference positioning maps of stems were then generated with DBH according to the index.   Table 3 summarizes the DBH and density of trees in each plot. The DBH of the stems varied from 7.1 cm to 30.4 cm, with a density of 560 to 832 stems/ha. Each stem was indexed, and the DBH was manually measured using diameter tape. The position of each stem in each plot was then measured with the total station according to the index. These positioning data were collected to match the manually measured DBH with the stem point clouds, so positioning errors at the decimeter level were acceptable. The impact of the stem diameter for positioning was, therefore, ignored. Reference positioning maps of stems were then generated with DBH according to the index.

Essential Data Processing before Implementing ANPDA
ANPDA is a pre-processing algorithm, dealing with the two-dimensional horizontal projected point clouds of stems at breast height. It is essential to generate projected point clouds before applying ANPDA. For each plot, the points of ground were first identified. A digital elevation model (DEM) of the plot was generated and rasterized. Based on the rasterized DEM, a 2 m-height surface of the plot parallel to the DEM surface was generated. The point cloud was clipped to retain the points between the two surfaces. Clustering analysis was then implemented. The clusters were regarded as stems, and the horizontal cluster center was approximately the horizontal stem center. The identified stem point clouds were then clipped into vertical slices. Considering the possible bias of the horizontal stem center and the tree density of the plots, the side length of the vertical clipping square was set at 2 m. Based on prior research [19], the vertical point cloud slice thickness was set to be 20 cm to guarantee circle fitting accuracy. The point cloud for each stem was clipped with only 2 m × 2 m × 0.2 m cubic space remaining. The center of the cubic space was 1.3 m above the DEM surface, with the same horizontal coordinate as the coordinate of the horizontal cluster center for each stem. The point cloud data were then projected onto the horizontal plane for circle fitting. Finally, the projected point cloud data for identified stems were indexed and recorded.
To simulate automatic DBH estimation processing, two software packages were used in this study. As shown in Figure 3, the procedures from ground point identification to stem identification were conducted with Computree [39]. Computree performs batch processing for forest point cloud data in a pre-determined order determined by the user. The algorithms of the applied plugins have been described in [40]. As a supplement, point cloud slicing and projection were implemented with MATLAB [41]. The positioning maps of identified stems were compared with the reference positioning maps to match projected point clouds with the reference data manually.

Essential Data Processing before Implementing ANPDA
ANPDA is a pre-processing algorithm, dealing with the two-dimensional horizontal projected point clouds of stems at breast height. It is essential to generate projected point clouds before applying ANPDA. For each plot, the points of ground were first identified. A digital elevation model (DEM) of the plot was generated and rasterized. Based on the rasterized DEM, a 2 m-height surface of the plot parallel to the DEM surface was generated. The point cloud was clipped to retain the points between the two surfaces. Clustering analysis was then implemented. The clusters were regarded as stems, and the horizontal cluster center was approximately the horizontal stem center. The identified stem point clouds were then clipped into vertical slices. Considering the possible bias of the horizontal stem center and the tree density of the plots, the side length of the vertical clipping square was set at 2 m. Based on prior research [19], the vertical point cloud slice thickness was set to be 20 cm to guarantee circle fitting accuracy. The point cloud for each stem was clipped with only 2 m × 2 m × 0.2 m cubic space remaining. The center of the cubic space was 1.3 m above the DEM surface, with the same horizontal coordinate as the coordinate of the horizontal cluster center for each stem. The point cloud data were then projected onto the horizontal plane for circle fitting. Finally, the projected point cloud data for identified stems were indexed and recorded.
To simulate automatic DBH estimation processing, two software packages were used in this study. As shown in Figure 3, the procedures from ground point identification to stem identification were conducted with Computree [39]. Computree performs batch processing for forest point cloud data in a pre-determined order determined by the user. The algorithms of the applied plugins have been described in [40]. As a supplement, point cloud slicing and projection were implemented with MATLAB [41]. The positioning maps of identified stems were compared with the reference positioning maps to match projected point clouds with the reference data manually.

ANPDA
In this section, an overview of ANPDA is given first, which describes the rationale behind ANPDA. A flow chart of how ANPDA works is also shown and explained in the first subsection. The technical details of each step are described in the following subsections.

Overview
To understand how ANPDA works, a basic understanding of the error is essential. Under ideal conditions, the stem surface is approximately a cylinder, which means the projection should be approximately a circle. Figure 4 shows a PLS-based horizontal projected stem point cloud with good co-registration. The projected point cloud is scattered in the shape of the ring. It is sparse in the inner and outer layers and dense in the middle.

ANPDA
In this section, an overview of ANPDA is given first, which describes the rationale behind ANPDA. A flow chart of how ANPDA works is also shown and explained in the first subsection. The technical details of each step are described in the following subsections.

Overview
To understand how ANPDA works, a basic understanding of the error is essential. Under ideal conditions, the stem surface is approximately a cylinder, which means the projection should be approximately a circle. Figure 4 shows a PLS-based horizontal projected stem point cloud with good co-registration. The projected point cloud is scattered in the shape of the ring. It is sparse in the inner and outer layers and dense in the middle.  The point cloud fragments with inaccurate and good co-registration are overlapping. There is no exact boundary between them. Thus, it is difficult to filter out the overlapping inaccurate coregistered fragments. In those inaccurately co-registered fragments, some points are far from the real stem surface, which usually leads to significant DBH overestimation for algorithms based on circle fitting. The developed ANDPA method identified the bulgy points as outliers before removing them for a better DBH estimation. These outliers usually distribute far from the fitting circle center. If there are any outliers outside the fitting circle, they are most probably the farthest from it. To describe such points, the concept of the outermost point is involved. Here, we defined the outermost point of the projected point cloud as the point farthest from the fitting circle center. Figure 5c shows an example of the identification of the outermost point.  Figure 5a,b shows a PLS-based horizontal projected stem point cloud with inaccurately co-registered fragments. In the point cloud with incorrect co-registration, some fragments bulge out of the surface. The projected point cloud looks like a ring with some crescent fragments inserted on it. The point cloud fragments with inaccurate and good co-registration are overlapping. There is no exact boundary between them. Thus, it is difficult to filter out the overlapping inaccurate co-registered fragments. In those inaccurately co-registered fragments, some points are far from the real stem surface, which usually leads to significant DBH overestimation for algorithms based on circle fitting. The developed ANDPA method identified the bulgy points as outliers before removing them for a better DBH estimation. These outliers usually distribute far from the fitting circle center. If there are any outliers outside the fitting circle, they are most probably the farthest from it. To describe such points, the concept of the outermost point is involved. Here, we defined the outermost point of the projected point cloud as the point farthest from the fitting circle center. Figure 5c shows an example of the identification of the outermost point.
ANPDA iteratively analyzes and removes outermost points to filter outliers, which was inspired by lathing. An iterative analysis is first implemented. In each iteration, the outermost point is analyzed and removed from the projected point cloud. After that, the projected point cloud and the fitting circle center is updated. According to the results of the iterative analysis, the ideal number of iterations for removing outliers is determined. The points remaining after this iteration are exported as the output. ANPDA iteratively analyzes and removes outermost points to filter outliers, which was inspired by lathing. An iterative analysis is first implemented. In each iteration, the outermost point is analyzed and removed from the projected point cloud. After that, the projected point cloud and the fitting circle center is updated. According to the results of the iterative analysis, the ideal number of iterations for removing outliers is determined. The points remaining after this iteration are exported as the output.
To determine when the iterative point removal process should end, the annular neighborhood of the outermost point was also involved in this study. We defined the distance between the outermost point and the fitting circle center as R. The distance between the inner and outer circle was described as pre-determined thickness value t. The annular neighborhood of the outermost point was an annular area with the center located on the fitting circle center. The outer radius was R, and the inner radius was max((R-t),0). Figure 6 shows an example of the annular neighborhood of the outermost point. To determine when the iterative point removal process should end, the annular neighborhood of the outermost point was also involved in this study. We defined the distance between the outermost point and the fitting circle center as R. The distance between the inner and outer circle was described as pre-determined thickness value t. The annular neighborhood of the outermost point was an annular area with the center located on the fitting circle center. The outer radius was R, and the inner radius was max((R − t), 0). Figure 6 shows an example of the annular neighborhood of the outermost point. Figure 7 shows an example of the polar angle distribution of the points within the annular neighborhood and all the remaining points while iteratively removing the outermost points. All points are divided into groups with equal polar angle range. The probability distribution of the polar angle is used to describe the polar angle distribution. In Figure 7a-c, the outermost point is an outlier. The polar angle distribution of the points within the annular neighborhood tends to be centralized around a certain angle. The distribution is very different from that of the remaining points. In Figure 7d, the outermost point tends to have good co-registration. All the points located farther from the center than this well co-registered point have already been removed. It means most of the outliers have been filtered out. The polar angle distribution of the points within the annular neighborhood is thus relatively uniform and similar to that of all remaining points. The degree of similarity is quite close Remote Sens. 2020, 12, 808 9 of 24 for all the outermost points with good co-registration compared with the outliers. This becomes a distinguishing criterion that is adopted by the algorithm.  Figure 7 shows an example of the polar angle distribution of the points within the annular neighborhood and all the remaining points while iteratively removing the outermost points. All points are divided into groups with equal polar angle range. The probability distribution of the polar angle is used to describe the polar angle distribution. In Figure 7a-c, the outermost point is an outlier. The polar angle distribution of the points within the annular neighborhood tends to be centralized around a certain angle. The distribution is very different from that of the remaining points. In Figure  7d, the outermost point tends to have good co-registration. All the points located farther from the center than this well co-registered point have already been removed. It means most of the outliers have been filtered out. The polar angle distribution of the points within the annular neighborhood is thus relatively uniform and similar to that of all remaining points. The degree of similarity is quite close for all the outermost points with good co-registration compared with the outliers. This becomes a distinguishing criterion that is adopted by the algorithm. The strategy for determining whether the outliers have been adequately removed is based on the characteristics of the two kinds of polar angle distributions mentioned above. In the iterative analysis, with the outermost points iteratively removed, the degree of similarity in each iteration is quantified to show the variability. If there is a trend that the polar angle distributions of points in the annular neighborhood and the remaining points were becoming more similar, it means the number of outliers is still decreasing significantly with the removal of the outermost points. When the degree of similarity is relatively stable, it means the outliers have already been properly removed. Once the critical point between these two states is found, the termination criterion for outermost point removal iterations is determined.
ANPDA was designed based on the ideas described above. The overall flow of the algorithm is illustrated in Figure 8.
As illustrated in Figure  of outliers is still decreasing significantly with the removal of the outermost points. When the degree of similarity is relatively stable, it means the outliers have already been properly removed. Once the critical point between these two states is found, the termination criterion for outermost point removal iterations is determined. ANPDA was designed based on the ideas described above. The overall flow of the algorithm is illustrated in Figure 8. The technical details of each step are further described in the following subsections.

Circle Fitting
The circle fitting algorithm used in ANPDA is the Landau algorithm [42], which was implemented by Sumith [43] and proved to be reliable for DBH estimation based on TLS point cloud data by Chang et al. [19]. The Landau algorithm is a non-iterative linear least square algorithm. The error is defined as the difference between the area of the fitting circle and the area of the circle with the same center and a radius of R ' , where R ' is the distance from the point to the center. By minimizing the squares of error, the radius and coordinates of the fitting circle center are acquired.
Because ANPDA is an iterative algorithm, a non-iterative circle fitting algorithm like the Landau algorithm can effectively reduce computation time.

The Choice of the Annular Neighborhood Thickness Value
To implement the annular neighboring points analysis, an appropriate thickness value for the annular neighborhood is required. When the thickness value of the annular neighborhood is too low, the polar angle distribution of the points in the annular neighborhood can be easily influenced by random error. The fluctuation of the similarity degree would influence the judgment of outliers. When the thickness value is too high, the algorithm is insensitive to the outliers near the good points because they have a similar annular neighborhood. Considering the PLS-based stem point cloud resolution, a thickness value of 0.5 cm was used in this study.

Polar Angle Probability Distribution Analysis
In this study, the polar angle for each point was defined in the polar coordinate system, with the fitting circle center as the origin. The polar angle of each point was made with the positive direction of the x-axis. Figure 7 also shows an example of calculating the polar angle probability distribution of a point cloud. For convenient calculation, the polar angle was converted into a value ranging from [0, 2π). The probability distribution of the polar angle was used to describe the polar angle distribution of the points in a point cloud. All points were divided into equal groups based on the polar angle range. The interval of each group was [0, 2π/N g ), [2π/N g , 4π/N g ), . . . , [2π(N g − 1)/N g , 2π), where N g denotes the number of groups. The probability distribution of the polar angle was recorded in a set as {P 1 , P 2 , . . . , P N g }, where P i denotes the probability of the polar angle of any point in the interval of [2π(i − 1)/N g , 2πi/N g ), which can be calculated by Equation (1): where n i denotes the number of the points with a polar angle ranging from [2π(i − 1)/N g , 2πi/N g ), and n all denotes the number of the points in the current point cloud. In this research, the number of groups was set to be eight.

Distribution Similarity Quantification
To quantify the degree of similarity between the polar angle distribution of the points in the annular neighborhood of the outermost point and all the points in the current point cloud, the concept of relative entropy is used. Relative entropy, also known as Kullback-Leibler Divergence, was first developed by Kullback et al. [44]. Now it is widely used for evaluating information that is lost when a real distribution is substituted by an approximate distribution. From another perspective, it also reflects whether a distribution is approximating another one. In Equation (2), KL(P Q) denotes the relative entropy, P(x), Q(x) are the two probability distributions, and Q(x) is the distribution used to substitute P(x). The relative entropy value is non-negative. When both distributions are the same, the relative entropy value is 0. A higher relative entropy value represents a greater difference between the two distributions. As relative entropy is not symmetric, it cannot be used to measure the distance between the two distributions.
Based on the knowledge of relative entropy, the similarity value of the current point cloud is defined in Equation (3) to quantify the degree of similarity between the polar angle distribution of points in the annular neighborhood of the outermost point and all the points in the current point cloud.
In Equation (3), S denotes the similarity value, while N g denotes the number of groups. P ann (x), P cur (x) denote the polar angle probability distribution of the points in the annular neighborhood of the outermost point and all the points in the current point cloud, respectively. For computational convenience, a natural logarithm is applied. It can be seen from Equation (3) that a smaller similarity value indicates that the polar angle distribution of the points in the annular neighborhood point cloud and the polar angle distribution of the points in the current point cloud are more alike, and vice versa.

The Termination Criterion
ANPDA mainly consists of two parts: distribution analysis and outlier removal. Both the two parts are implemented iteratively. Therefore, the termination criteria are essential.
The aim of iterative distribution analysis is to show the variability of similarity value and find the critical point when the outliers are properly removed. Considering ANPDA's outlier identification strategy, more iterations theoretically provide more data to describe when the similarity value becomes relatively stable after the outliers have been properly removed. Under ideal conditions, the iteration should continue until there are so few points remaining that the polar angle distribution would be easily influenced by random error. However, there is still a type of error that requires practical attention. A projected point cloud is usually in the shape of a ring. Although the outliers outside the outer circle of the ring cause the main error, there are still many points with big error inside the inner circle. During the analysis, the outermost points are iteratively removed. After removing most points with good co-registration, the proportion of the inside error points can be quite high. When the current point cloud consists of insufficient remaining points, it is also not well co-registered. These point clouds are not suitable for describing the characteristics of a good co-registered point cloud and should be excluded from the iterative distribution analysis. Considering the above, the termination criterion of iterative distribution analysis is determined. In this study, a minimum remaining points number of 500 was used to reduce the influence of the inner error points.
To operationalize the outlier identification strategy described above, the termination criterion for iterative outlier removal is determined based on the variability of the similarity value. The critical iteration should be the one when the similarity value becomes approximately stable. For operational convenience, the critical point is determined to be the iteration in which the decline of the similarity value first approaches the average of the similarity value of all remaining iterations. All points removed prior to that iteration are regarded as outliers. The termination criterion of outlier removal can be written as Equation (4): where S i denotes the similarity value of iteration I, and I denotes the number of iterations in the iterative distribution analysis. Figure 9 shows an example of the final iteration for outlier removal and the remaining point cloud.

Evaluation of ANPDA
Improvements in DBH estimation accuracy after applying ANPDA were used to evaluate ANPDA performance. The DBH estimated from projected point cloud data before and after applying ANPDA were compared to determine whether ANPDA was effective. Bias, mean absolute error (MAE), and root mean squared error (RMSE) facilitated the evaluation with the following equations: where n denotes the number of stems, d denotes the estimate, and D denotes the reference. The Landau circle fitting algorithm described in Section 2.4.2 was applied for DBH estimation.
value first approaches the average of the similarity value of all remaining iterations. All points removed prior to that iteration are regarded as outliers. The termination criterion of outlier removal can be written as Equation (4): (4) where Si denotes the similarity value of iteration I, and I denotes the number of iterations in the iterative distribution analysis. Figure 9 shows an example of the final iteration for outlier removal and the remaining point cloud.

Evaluation of ANPDA
Improvements in DBH estimation accuracy after applying ANPDA were used to evaluate ANPDA performance. The DBH estimated from projected point cloud data before and after applying ANPDA were compared to determine whether ANPDA was effective. Bias, mean absolute error (MAE), and root mean squared error (RMSE) facilitated the evaluation with the following equations: Inevitably, some stem point cloud slices were unqualified due to poor automatic stem identification and point cloud segmentation algorithms. A forked stem (crevice or crotch shape stems formed by doubletree trunks) might be identified as a single stem. Some clipped stem point cloud slices were mixed with points of leaves and wrapping branches. Because the main concern of this research was whether ANPDA was effective in reducing the influence of incorrect point cloud fragments registration, stems with point cloud segmentation problems were counted but not used for evaluating ANPDA. The results were then compared with the reference data.
In prior research, the performance of PLS-based DBH estimation was evaluated with the estimation accuracy [5, 6,8,9]. However, the quality of stem point cloud data is various and has a massive impact on estimation accuracy. Evaluating a DBH estimation method just by the final estimation accuracy is not sufficient. This study also considered how effective ANPDA was for the data of different quality. We used the mean squared error (MSE) to evaluate the quality of the horizontal point cloud slices according to how well the horizontal projection fits a circle. MSE is a well-known indicator that has been used to evaluate DBH estimation by former researchers [45]. In this study, it provided a measure of how a projected point cloud could be replicated by the fitting circle using least squares. Based on the error defined by the Landau algorithm, the MSE was derived using Equation (8): where MSE denotes the mean squared error, n p denotes the number of the points in the point cloud, (x i , y i ) is the coordinate of a point, and (x, y) is the coordinate of the center of the fitting circle. To facilitate horizontal point cloud classification, value p, which was the natural logarithm of MSE, had been used. It made the number of samples in each group more even. Value p was calculated using Equation (9): All the horizontal point cloud slices were classified according to the value p. A smaller p means the distribution of the projected point cloud is more similar to a circle, which usually means better data quality. To assess whether ANPDA was reliable for stem-level inventory in each class, the effective rate was also calculated with Equation (10): where n denotes the number of the stems, and n e denotes the number of stems that achieved a smaller estimation error after applying ANPDA. The average error reduction after applying was also calculated to evaluate the efficacy, with Equation (11): where n denotes the number of stems, d denotes the estimate after applying ANDPA, d denotes the estimate without applying ANDPA, and D denotes the reference. The relationship between data quality and the effectiveness of ANPDA has been described based on these results.

The Performance of ANPDA in Different Test Plots
The performance of ANPDA was tested in all six test sites. The scanning time for each plot was less than ten minutes. The processing time of applying ANPDA was about one minute for each plot with an Intel(R) Core(TM) i7-4700MQ CPU @ 2.40 GHz processor. Of the 247 stem horizontal point cloud slices clipped from the identified stems, five had the forked stem problem, while 13 of the horizontal point cloud slices were wrapped by branches and leaves. With the exception of these unqualified stem point cloud slices, the remaining 229 slices did not have huge errors arising from automatic stem identification and segmentation algorithms. All 229 qualified horizontal stem point cloud slices were used for evaluating ANPDA. Figure 10 shows the DBH estimation accuracy before and after applying ANPDA. It could be seen from the results that in all the six test sites, there was a decline in all three accuracy indicators (bias, MAE, and RMSE) that were used to reflect the error. As shown in Figure 10, before ANPDA was applied, the bias of the estimated DBH for the six test sites was distributed around 1.83-9.45 cm, which showed a general trend of overestimation. The MAE and RMSE was 1.84-9.45 cm and 2.30-10.71 cm, respectively. Among the three kinds of working conditions, the estimation accuracy was best in the flat plots with the dry and smooth ground surface and worst for the slopes with the moist and soft ground surface. After implementing ANPDA, the reductions were 1.59-6.61 cm for bias, 0.85-5.41 cm for MAE, and 0.92-5.85 cm for RMSE. Figure 10 shows the DBH estimation accuracy before and after applying ANPDA. It could be seen from the results that in all the six test sites, there was a decline in all three accuracy indicators (bias, MAE, and RMSE) that were used to reflect the error. As shown in Figure 10, before ANPDA was applied, the bias of the estimated DBH for the six test sites was distributed around 1.83-9.45 cm, which showed a general trend of overestimation. The MAE and RMSE was 1.84-9.45 cm and 2.30-10.71 cm, respectively. Among the three kinds of working conditions, the estimation accuracy was best in the flat plots with the dry and smooth ground surface and worst for the slopes with the moist and soft ground surface. After implementing ANPDA, the reductions were 1.59-6.61 cm for bias, 0.85-5.41 cm for MAE, and 0.92-5.85 cm for RMSE.   Table 4 shows the improvement of DBH estimation accuracy in all six test sites. The bias of the estimation was from 0.24-2.84 cm. The reduction rate of the bias was 53.80-87.13% after applying ANPDA, which showed high reliability at the plot level. The reduction rates of MAE and RMSE were 38.82-57.30% and 27.17-56.02%, respectively. It meant that the variation of the DBH estimation around the reference data had decreased. The estimation became more reliable after applying ANPDA. It could be seen from Figure 10 and Table 4 that after applying ANPDA, the bias, MAE, and  Table 4 shows the improvement of DBH estimation accuracy in all six test sites. The bias of the estimation was from 0.24-2.84 cm. The reduction rate of the bias was 53.80-87.13% after applying ANPDA, which showed high reliability at the plot level. The reduction rates of MAE and RMSE were 38.82-57.30% and 27.17-56.02%, respectively. It meant that the variation of the DBH estimation around the reference data had decreased. The estimation became more reliable after applying ANPDA. It could be seen from Figure 10 and Table 4 that after applying ANPDA, the bias, MAE, and RMSE of the DBH estimation results decreased in all six test plots. These results showed good adaptability of ANPDA for different working conditions.

The Performance of ANPDA for Horizontal Stem Point Cloud Slices of Different Quality
After classifying the qualified horizontal point clouds with value p, the performance of ANPDA for point cloud slices of different quality was analyzed. Figure 11 shows that without applying ANPDA, the data of lower quality led to higher DBH estimation error. After applying ANPDA, the difference in the error for all six classes became smaller. There was a decline in all three accuracy indicators for the DBH estimation error after applying ANPDA. Table 5 shows the average error reduction and the effective rate of applying ANPDA. Aside from the class with a p-value of −4 to −3, which only contained three-point clouds, the average error reduction and effective rate of ANPDA decreased as the value of p decreased. This indicated ANPDA performed better for data of lower quality. Aside from the class with a p-value of −9 to −8, which only had six samples, the effective rate of ANPDA was generally higher than 80%. This reflected the reliability of ANPDA for horizontal point cloud slices of different quality at the single tree scale. Table 6 shows the improvement in DBH estimation accuracy for all six classes based on p. The error reductions of 62.79-71.65% for bias, 38.39-69.39% for MAE, and 32.23-68.88% for RMSE were achieved. These results indicated that ANPDA was generally effective for stem point clouds of different quality. Figure 12 shows examples of applying ANPDA on point clouds of different data quality. The blue points represent the points that were removed, while the red points make up the output point clouds. The points of the point clouds with higher p-value usually have a more discrete distribution. After applying ANPDA, the bulged points in point clouds of different quality were excluded, and the surface of the point clouds became smoother and more similar to a circle. Remote Sens. 2020, 12 Table 5 shows the average error reduction and the effective rate of applying ANPDA. Aside from the class with a p-value of −4 to −3, which only contained three-point clouds, the average error    Figure 12 shows examples of applying ANPDA on point clouds of different data quality. The blue points represent the points that were removed, while the red points make up the output point clouds. The points of the point clouds with higher p-value usually have a more discrete distribution. After applying ANPDA, the bulged points in point clouds of different quality were excluded, and the surface of the point clouds became smoother and more similar to a circle.

The Adaptivity of PLS for Forest Stands on the Slope
To test whether ANPDA was widely applicable for practical work, experiments were carried out in the forest plantation on the hillside with typical working conditions in southwest China. The six test plots represented three different working conditions: (1) flat plots with dry and smooth ground surface; (2) plots on a slope with the dry and smooth ground surface, and (3) plots on a slope with the moist and soft ground surface. In prior studies, the DBH estimation error for PLS-based data was usually from 1-4 cm for RMSE [4,[6][7][8][9][10][11][12][13][14]. However, this accuracy level was not always achieved in this study.
Without applying ANPDA, only the data acquired in condition (1) had a relatively good estimation accuracy. The estimation accuracy was best in the walkable flat plots and worst for the slopes with the moist and soft ground surface, which also proved the impact of working conditions for DBH estimation. This is because whether the scanner can work steadily severely impacts the data accuracy. When PLS was working on a slope, especially a slope with the moist and soft ground surface, the PLS operator needed more actions to avoid danger. These actions caused sudden moves and rotations of the sensor, which led to more IMU drift. The DBH estimation accuracy was, therefore, reduced.
In this study, all the test plots were located in the forest stands on the hillside. Walking across the forest was the only non-destructive way to reach the test plot, meaning vehicle-based MLS was not available. The efficiency was also a factor that had been counted. In prior research, Giannetti et al. [12] attempted to compare TLS with PLS. Although multi-scan TLS had higher DBH estimation accuracy (with the RMSE 0.15 cm lower than PLS) using scans of 1-h in total, PLS data obtained almost the same accuracy with only seven minutes of walking. Oveland et al. [13] compared three different ground-based laser scanning methods: TLS, hand-held PLS, and backpack PLS. The backpack PLS had the shortest working time of 16 min, the highest tree detection rate of 87.5%, and the highest DBH estimation accuracy with an RMSE of 2.2 cm. Considering the time and labor caused by repetitively mounting and dismounting instruments on the slope, TLS was also not a good choice in this research. Among all the laser scanning techniques, PLS might be the only one that is adaptive for forest stands in the mountainous area due to its portability and good pass ability. It could have become the most appropriate solution if the DBH estimation accuracy was ensured. However, in this study, the accuracy of direct DBH estimation was not satisfying because of the poor working conditions in which the scanner could not work steadily. According to the results, direct DBH estimation using PLS data did not always lead to reliable DBH estimation accuracy on the slopes. After applying ANPDA, the estimation accuracy was improved and achieved the accuracy level in prior research. The adaptivity of ANDPA for poor working conditions showed good potential for making up the gap of DBH in-situ measurement solutions.

ANPDA for Horizontal Point Cloud Slices of Low Quality
According to the results, ANPDA performed better for data of lower quality. The average error reduction and the effective rate of ANPDA decreased as p decreased. This could be because ANPDA deals with outliers outside the fitting circle and works better when the error source is mainly from outside. For projected point clouds of low quality, the points outside the circle were usually far away from the fitting circle and were a more important error source. When the data quality was sufficiently good, the error from outside was not much higher than the error from other sources, which might have made the efficacy of ANPDA less significant.
DBH estimation usually achieves a lower accuracy using data of lower quality, which can also be inferred from the results. In prior research, some data sets were even excluded from the research because of their low data quality [7]. With the application of ANPDA, relatively reliable results were derived from low-quality data, which confirmed that ANPDA had the potential to reduce the essential manual intervention while estimating DBH. Its adaptivity to low-quality data also made automatic DBH estimation less sensitive to specific abnormal data. While modeling, retaining a more complete data set would make the sample plots more representative as well.

ANPDA for Other Error Sources
Practically, errors are not always caused by co-registration. Sometimes, the error in PLS-based DBH estimation is caused by points that do not belong to the trunks [11]. In this study, ANPDA showed good adaptability for some of these errors. It was also effective for dealing with the occluding objects that were clustered in a specific direction, such as vines, falling bark pieces, independent branches. Nevertheless, for stems surrounded by branches and leaves, the DBH estimation accuracy was still not high enough after applying ANPDA. It should be also noted that ANPDA did not apply to the forked stem problem. This was because ANPDA tends to regard an outermost point as an outlier when the polar angle distribution of the points in its annular neighborhood is different from all the points in the point cloud. Therefore, it helped remove points from occluding objects when the polar angle was distributed around a specific value. However, when the amount of the error points was too big, or they were uniformly distributed around, ANPDA was no more applicable. For forked stems, the center of the stem would be regarded as a point between the two trunks while implementing circle fitting, which led to incorrect estimation. Stem point cloud segmentation algorithm for covering branches and forked stems is suggested in future studies while implementing ANPDA.

ANPDA for Automatic and Hierarchical Semi-Automatic DBH Estimation
To explore the full potential of PLS, automatic data processing was a primary concern in this study. In prior research, a mostly applied processing framework was "ground point identification-stem positioning-breast height point cloud slicing-DBH extraction" [4,5,10,[12][13][14]. Although not all the researchers have declared clearly whether their processing was automatic or not, their attempts have already confirmed this framework to have great potential for automatic DBH estimation. For automatic processing, the algorithmic complexity is crucial. ANPDA has introduced a non-iterative circle fitting algorithm, the Landau algorithm. It has avoided applying an iterative algorithm in another one. The algorithmic complexity is, therefore, reduced from O(n 2 ) to O(n). That's the reason why the computational time for each plot is only about one minute. The experiments in this study showed that ANPDA could be fused appropriately with the current automatic processing framework as a preprocessing module between breast height point cloud slicing and DBH extraction. However, it also reflected the limitations of the current framework. There were still a number of point clouds for which acceptable DBH estimation results were not available, regardless of whether ANPDA was applied, which meant that ANPDA might not be suitable for these types of errors. In the current automatic DBH estimation framework, when the data processing method was not suitable for some of the point clouds, which could not be foreseen ahead of time, the errors were typically so large that even the error from a few stems could be as large as the sum of the errors from all the other stems in the plot. To ensure that the results estimated from these "unqualified" point clouds would not ruin the whole experiment, researchers might need to check all the point clouds by visual interpretation [7]. On the one hand, this is contrary to the original intention of automation, and on the other, when collecting point cloud data for a forest at a scale larger than plot level, the time and labor costs of visual interpretation may not be acceptable.
According to the reasons above, we suggested that the problem could be solved using hierarchical semi-automatic processing, which has also been attempted by Cabo et al. [11]. A semi-automatic filtering module can be added after breast height point cloud slicing. By analyzing the features of the data sets themselves, the algorithm attempts to identify the data sets that could lead to unacceptable large error. Only these data sets would be sent to the human operators for visual interpretation and manual intervention, which may lead to significant labor savings due to improved accuracy. For ANPDA, this feature can be the p-value. As shown in Figure 11, higher p values typically produce larger errors. It is recommended that the relationship between the data quality of horizontal point cloud slices and the final error be explored in future research so that ANPDA can be applied in a more robust and flexible way through hierarchical semi-automatic processing.
In the present DBH estimation framework, ANPDA was more like a remedial measure for making up the accuracy loss caused by inaccurate co-registration of point cloud fragments. Future research should also pay more attention to point cloud fragment co-registration algorithms. Avoiding inaccurate registration and removing low-quality point cloud fragments might be the key to obtaining higher accuracy results.

Outlook
To achieve the optimal performance of ANPDA, more experiments should be taken in future research. Influenced by the laser scanners and co-registration algorithms, the form of PLS point cloud can be various. Unlike the overestimation shown in the results of this study, some instruments, such as the ZEB1 handheld mobile laser scanner, can also cause underestimation [6,12,14]. More PLS instruments should be used to acquire a better understanding of the error and improve ANPDA. The determination strategy of the parameters should be studied in future research to achieve the optimal performance of ANPDA. Iterative points removal from inside to outside can also be attempted as a supplementary.
In prior research, circle fitting-based DBH estimation has been developed into a mature method [7][8][9]11]. However, a number of PLS-based DBH estimation experiments have been conducted using a cylinder fitting approach [4][5][6]10], which has the advantage of taking the spatial structure of the point clouds into consideration. For PLS-based DBH estimation using cylinder fitting algorithms, iterative outermost point removal and polar angle distribution analysis are also suitable in theory. It is highly recommended that the core idea of ANPDA be promoted into cylinder fitting-based DBH estimations.

Conclusions
This paper proposed a novel PLS-based DBH estimation preprocessing algorithm, named ANPDA. ANPDA aimed to reduce the impact of overlapping inaccurately co-registered point cloud fragments on DBH estimation. It identified outliers using polar angle distribution analysis of the points in the annular neighborhood. The processed point cloud was then used for circle fitting-based DBH estimation methods. This algorithm supported the current automatic PLS-based DBH estimation framework and provided a new idea for PLS point cloud preprocessing.
Results showed that the performance of ANPDA was better for point cloud data acquired in poor working conditions (e.g., slopes with soft and moist ground surface) or stem point clouds of lower quality. ANPDA was also adaptive for dealing with occluding obstacles clustered in a specific direction, such as vines, falling bark pieces, independent branches. Forked stems and stems covered by dense branches and leaves should be well segmented before applying ANPDA. It appeared that ANPDA could be conveniently fused with an automatic PLS-based DBH estimation process as a preprocessing algorithm. Furthermore, it has the potential to predict and warn operators of potential large errors during hierarchical semi-automatic DBH estimation.