Separating Tree Photosynthetic and Non-Photosynthetic Components from Point Cloud Data Using Dynamic Segment Merging

Many biophysical forest properties such as wood volume and leaf area index (LAI) require prior knowledge on either photosynthetic or non-photosynthetic components. Laser scanning appears to be a helpful technique in nondestructively quantifying forest structures, as it can acquire an accurate three-dimensional point cloud of objects. In this study, we propose an unsupervised geometry-based method named Dynamic Segment Merging (DSM) to identify non-photosynthetic components of trees by semantically segmenting tree point clouds, and examining the linear shape prior of each resulting segment. We tested our method using one single tree dataset and four plot-level datasets, and compared our results to a supervised machine learning method. We further demonstrated that by using an optimal neighborhood selection method that involves multi-scale analysis, the results were improved. Our results showed that the overall accuracy ranged from 81.8% to 92.0% with an average value of 87.7%. The supervised machine learning method had an average overall accuracy of 86.4% for all datasets, on account of a collection of manually delineated representative training data. Our study indicates that separating tree photosynthetic and non-photosynthetic components from laser scanning data can be achieved in a fully unsupervised manner without the need of training data and user intervention.


Introduction
Forest foliage profile affects the photosynthesis and evapotranspiration processes [1], species competition, wood production, and ecosystem and agro-ecosystem dynamics [2].Many biophysical forest properties such as wood volume [3] and leaf area index (LAI) [4] require prior knowledge on either photosynthetic (e.g., leaf, grass, and flower) or non-photosynthetic (e.g., stem and branch wood) components.For example, understanding the spatial distribution of wood volume contributes to the monitoring of carbon stocks in forested ecosystems [3].Above ground biomass estimation typically requires wood-only parts [5].On the other hand, LAI (m 2 /m 2 ), which is defined as one-half of the total green leaf area per unit ground surface area [6], is a key descriptor of vegetation condition in numerous physiological and biogeochemical studies [7].For a better estimation of leaf area density (LAD) (m 2 /m 3 ), which is defined as the total one-sided leaf area per unit volume, the wood and leaf parts of each tree should be separated [8].Otherwise, the LAI/LAD will be overestimated if woody components were not eliminated.Therefore, it is vital to quantitatively describe forest structures.
Laser scanning, also known as light detection and ranging (lidar), emerges as an innovative technique for nondestructive quantification of forest structures [9][10][11][12].From single tree level to plot level forest studies, terrestrial laser scanning (TLS) is often utilized, while airborne laser scanning (ALS) is usually applicable for region scale studies.TLS is a type of ground-based scanning strategy.It acquires three-dimensional (3D) coordinates in combination with radiometric information of objects.The acquired high-density point clouds enable detailed tree quantification in a nondestructive way [9].TLS has widespread applications in estimating tree attributes, such as stem location [13], diameter at breast height (DBH) [14], basal area and volume [15], above ground biomass [16] and LAD [8].Some efforts have been made to separate photosynthetic and non-photosynthetic components [1,2,8,[17][18][19].However, separating these components in TLS data is still challenging [20].
Overall, existing methods for separating photosynthetic and non-photosynthetic components from point cloud data can be categorized into two groups: intensity based and geometry based.A different third approach uses the combination of radiometric and geometric features [21].Intensity-based methods [2,19] use radiometric information of objects captured by a laser scanner.For forest analysis, the assumption is that wood and leaf components have different optical properties at the operating wavelength of the laser scanner [17].However, these optical properties are influenced by the distance, partial hit, and laser incident angle [22].Therefore, a key challenge of using intensity values is that they have to go through instrument-specific radiometrical calibration before being included in further processing [23][24][25].Calders et al. [23] have shown that the radiometrical calibration carried out for a specific scanner cannot be transferred to another scanner, which greatly limits the applicability of using intensity information for downstream processing.Various authors have made attempts to construct and use multi-wavelength laser scanning to exploit different material reflectance at different wavelengths [26][27][28].
On the other hand, geometry-based methods only use the three-dimensional coordinates of the point cloud captured by a laser scanner, thus giving more potential and usability as point coordinates are the most fundamental information acquired by a laser scanner [17].For separating photosynthetic and non-photosynthetic components, supervised machine learning classification has shown promise and is potentially applicable to any tree point clouds [20].Geometrical features are extracted for each point.Machine learning classifiers such as Support Vector Machine (SVM) [18], Gaussian Mixture Model (GMM) [1,29], and Random Forest (RF) [30] have been employed to classify wood and leaf points.A major drawback of supervised machine learning classification is the requirement of training data.First of all, manual delineation of various components from point cloud data can be extremely tedious.Rendering and manipulating the high density TLS data is also an intensive task for hardware.Furthermore, manual selection of training points is impractical for processing large numbers of trees [20].Second, the spatial distribution of training data greatly impacts the overall performance of machine learning methods.Points on main stems, small branches, leaves, bushes, and so on have to be carefully covered, which impedes the feasibility of manual manipulation.A group of other unsupervised geometry methods [8,17] look at specific geometric properties of certain components.For example, Li et al. [8] separated magnolia leaves by assuming that those leaves were basically flat in surfaces.Tao et al. [17] observed that tree trunk and branch boundaries appear as circles or circle-like shapes.However, these methods were only applied to single trees with multi-scan TLS data coverage.It is unclear if they can be adapted to nature forest scenes and to point cloud data acquired from strategies other than multi-scan TLS such as single-scan TLS and Simultaneous Localization and Mapping (SLAM).
The current study focuses on developing a fully unsupervised approach that is free of user intervention and manual training data, for separating tree photosynthetic and non-photosynthetic components from various point cloud data with varied acquisition sources.The core observation applied in our method is that non-photosynthetic components such as stems and branches appear to be linear at various scales.Therefore, we firstly propose a robust and dynamic point cloud segmentation routine, namely Dynamic Segment Merging (DSM), to partition a point cloud into homogeneous parts.Then, the linear segments are identified by examining segments' feature saliency [1].We test the effectiveness of the proposed DSM method using one single tree dataset, and four plot-level datasets.These datasets cover varied data acquisition strategies, scene complexities, and scanning instruments.One plot additionally features calibrated intensity information.We also compare our results to a supervised random forest model.Another incidental aim of our work is to efficiently process large amounts of points such as forest level data.We show examples of how point cloud structuring can accelerate the process.
In the following, our test data are described in Section 2. In Section 3, the proposed DSM method is explained in detail, together with a brief description of the compared RF method.We give visual and quantitative results in Section 4. In Section 5, we discuss the performances and potential improvements of our method.A general application extension example is also given.Finally, the major findings and conclusions of our work are summarized in Section 6.

Study Data
Five datasets consisting of one single tree set and four plot-level sets are used in this study.These datasets are intended to cover various data sources such as single-scan TLS, multi-scan TLS, and hand-held laser scanning; various tree species including both coniferous and deciduous trees; various terrain conditions such as urban roadside, flat terrains, and steep mountains.Terrain points were removed in advance using the method proposed in Zhang et al. [31], which is also freely available as a plugin in the open-source software CloudCompare (CloudCompare 2.9.1, 2018).One of the plot-level datasets was also radiometrically calibrated, so that we can derive results from calibrated intensity information as well.

Single Tree Data-SD-1
The first dataset represents TLS data of an evergreen sub tropical tree, Erythrophleum fordii, which is provided by Hackenberg et al. [32].The data were acquired in October 2013 from eight scan positions.For our single tree analysis purpose, the acquired point cloud was further manually cleaned, as the tree crown interacts with other trees.Therefore, points from adjacent trees' foliage need to be removed.The cleaned point cloud contains ∼3.9 million points.In this study, we cut out of a section of crown in order to analyze in depth the capability of our method for separating photosynthetic and non-photosynthetic components inside the canopy (Figure 1a).

Plot-Level Data-PD-1
In May 2017, a plot with 50 m radius inside a forest in Großgöttfritz in the federal state of Lower Austria (Austria) was scanned with a TLS Riegl VZ-2000 scanner (RIEGL Laser Measurement Systems, Horn, Austria) (Figure 1b).The measurement was carried from two scanning positions.These two measured point clouds were co-registered later using Riegl's RiSCAN PRO software (RIEGL Laser Measurement Systems, Horn, Austria).The forest consists of mainly conifers (Pinus sylvestris L.) and a couple of silver birches (Betula pendula).A radiometric calibration was performed prior to the present study with a Spectralon of a known reflectivity of 99%.Twenty-five measurements were made with the lidar at distances ranging from 1 to 50 m in order to calibrate the intensity measured by the lidar [22].Approximately 600,000 wood and leaf points were manually selected in order to examine their corresponding intensity values.Consequently, the corrected intensity information is shown in Figure 2. By setting a threshold of 0.78, we separate photosynthetic and non-photosynthetic components for this dataset, besides using the proposed geometry method.

Plot-Level Data-PD-2
This plot is one of a large set of plots provided by the EuroSDR International TLS benchmark project: Benchmarking of Terrestrial Laser Scanning for Forestry Applications [33].These plots were located in a southern Boreal forest in Finland, and were scanned by the Finnish Geodetic Institute (FGI) during the summer of 2014, with a Leica HDS6100 scanner (Leica Geosystems AG, Heerbrugg, Switzerland).Twenty-four sample plots were provided by this project.Each plot features two TLS datasets; a single-scan set from the plot center, and a multi-scan set from the center and four corners.Each plot has a size of 32 m × 32 m.In this study, we randomly select one plot from the single-scan datasets (Figure 1c).

Plot-Level Data-PD-3
TLS data of a plot with 15 m radius were acquired by the Chinese Academy of Forestry (CAF) in April 2017.The plot is located in the Baihuashan National Natural Reserve (Beijing, China).The tree species consists of mainly Dahurian Larch (Larix gmelinii) and white birch (Betula platyphylla).The measurement was carried out with a Trimble TX8 scanner (Trimble Inc., Sunnyvale, CA, USA).Multi-scans were performed from five locations.The resulting point clouds were further co-registered by using the Trimble RealWorks software (Trimble Inc., Sunnyvale, CA, USA) (Figure 1d).

Plot-Level Data-PD-4
This plot is a section of a test dataset from a roadside forest in Wolfsgraben in the federal state of Lower Austria (Austria) (Figure 1e).The main tree species in this plot is European beech (Fagus sylvatica) with a coniferous tree and broad-leaved weeds.The dataset was acquired by a hand-held scanner GeoSlam Zeb-1 (GeoSLAM Ltd., Nottinghamshire, UK).Zeb-1 is a lightweight and hand-held laser scanner which records more than 40,000 measurement points per second.Due to the limitation of its measurement range, the canopy level was barely covered.The purpose of acquiring this dataset was to test the lightweight scanner in forest-related studies, such as location detection, wood-leaf separation, diameter estimation, and stem curve retrieval.

Validation Data
Validation data are needed to test the separation accuracy in this study.We manually classified whole point clouds (Figure 1) except the PD-3 dataset, due to its extremely complex subcanopy and understory structure.Validation data for the PD-3 set were spatially evenly distributed, and covered all class types such as leaf, small branch, main branch, and trunk (Figure 1d).Manual classification was carefully performed in the open-source software CloudCompare (CloudCompare 2.9.1, 2018).In addition, we downsampled the validation points to equal numbers for both non-photosynthetic and photosynthetic components, so that the separation accuracy will not be biased by the category that has more points than the other.

Methods
Our method builds upon the conventional region growing algorithm mentality [34].We intend to semantically segment a point cloud into meaningful parts, so that each segment will only contain points belonging to a unique class.In this way, tree stems and branches are isolated from leaf points.Subsequently, all segments representing stem and branch can be identified by examining the linearity feature saliency of each segment.However, the challenge lies in the fact that forests have extremely complex structures.A segmentation routine has to be robust against irregular point cloud structures, varied point cloud densities, and indistinct boundaries between object classes.The conventional region growing method has difficulties on gradual change regions, which causes under-segmentation [35].This deficiency is also evident in our objective, because tree branches and leaves often will not have distinct boundaries in point clouds.To mitigate such a disadvantage, we develop an over-segmentation-based dynamic merging strategy to segment a tree point cloud semantically and robustly.
Our proposed DSM method initially segments the point cloud into small parts, resulting in over-segmentation.Each small part is defined as a Segment.If these small segments are represented by a single point individually, the resulting representative is also known as a collection of Superpoints (e.g., [36]).The following merging process is a dynamic approach, which is similar to the region merging idea applied in the image process (e.g., [35,37]).We defined a similarity metric which was estimated for each segment and used it for merging similar neighboring segments step by step.Meanwhile, the similarities of neighboring segments are updated after each merging event occurs.Therefore, the merging procedure is dynamic, while the conventional region growing method uses static strategies to test and merge neighboring regions, thus not preserving global properties.

Dynamic Segment Merging
In a point cloud region growing routine (e.g., [38]), a constraint or a combination of constraints has to be defined for deciding whether neighboring points can be merged with the current segment.Such constraints are typically on features such as normal vector variation [38], amplitude density [39], and coordinates of the origin's normal projection on the best fitted plane [40].In this study, we use a simple constraint that is the deviation between the z-component of normal vectors.The z-component of normal vectors represents its spatial orientation relative to the vertical direction (i.e., verticality).In other words, it stands for the angular disparity between a normal vector and the z-axis.For example, the z-component of a normal vector will vary-in absolute value-from 0 for a point located on a vertical surface to 1 for a point on a horizontal surface.This simple criterion is useful to segment tree stem and branches, as they will grow in the similar orientation locally in space.

Normal Vector
Consistent with many other studies (e.g., [38]), our proposed DSM method heavily relies on point cloud normal vectors.In fact, estimation of surface normal vectors is one of the fundamental problems for point cloud analysis [41].Challenges arise from outlier points, non-uniform distributions, and missing points [42].When estimating normal vectors, a group of neighboring points has to be defined.A point cloud can be defined as P = {x i , y i , z i , i = 1..n}; P ⊂ R, x i , y i , z i are the coordinates of a point p i in P. The covariance measures the variation of each dimension from the mean with respect to each other.The eigenvalues of the covariance matrix represent the variation along the direction of the eigenvectors [43].Therefore, the normal vector NV at a point p i ∈ P can be estimated as the eigenvector to the smallest eigenvalue EV i of the covariance matrix [44] given by where p is the barycenter of the K nearest neighbors of the point p i .The z-component of the normal vector is denoted as N z .
The selection of K neighborhoods is often determined by empirical or prior knowledge on the scene, and usually with a fixed size [45].However, many scenes contain structures with diverse sizes.For example, in forests, tree trunks and different levels of branches have varied structure scales.This indicates that multi-scale analysis or adaptive analysis can be useful in better sensing the scene (e.g., [46][47][48]).To examine the effects raised from K, we test two strategies with a fixed K at 10 neighborhoods and an adaptive one.To select adaptive optimal neighborhoods, we follow the approach proposed in Weinmann et al. [45].This approach minimizes an eigenentropy given by across different scales ranging from 9 to 99 nearest neighbors with an increment of 9 to find the optimal neighborhood size.

Initial Segmentation
We firstly partition the point cloud into small segments.Each segment should satisfy the homogeneity requirement and be geometrically simple.Several methods were deployed to overly segment a point cloud, such as spectral clustering [49] and global energy model [36].There are several advantages of overly segmenting the point cloud into small simple and homogeneous segments.First, a segment that contains a group of points carries on more information in describing the geometric characteristics of objects than a single point [37,50].Second, the number of initial segments (e.g., Superpoints) is much fewer than the number of original points in a point cloud, thus accelerating the processing speed [37].Lastly, the initial segments help to identify isolated points.These isolated points are not grouped with other points, meaning that they are local outliers that cannot be merged with surrounding points.Therefore, they can be preliminarily removed as they will also participate in the dynamic similarity tests, which will greatly slow down the processing.However, they cannot be simply removed exhaustively, because locally isolated points may be grouped to a large structure.
In this study, we simply re-allocated each isolated point to the final results by majority voting in a sphere neighborhood.
In this study, we develop a computationally efficient strategy to overly partition the point cloud into geometrically simple segments.For a single point, a sphere is created around the point.The radius of the sphere is relevant to the study data.It should be large enough to contain meaningful geometry properties, but retain computational efficiency as a point cloud can easily contain millions of points.The range query can be exhausted with a large search radius [51].For forest scenes, we use a radius of 25 cm for all datasets in this study.The 25 cm radius is loosely defined, and other numbers should work equally as long as they reflect local geometry and retain the computation efficiency.For example, a sensitivity test with values ranging from 15 cm to 35 cm with an increment of 5 cm showed that the standard deviation of accuracy is only 1.2%.All neighboring points within the sphere are ranked based on the distance to the center point in ascending order.The ranked points are tested orderly if the difference between its N z and that of the center point is smaller than a threshold T (see Section 3.1.3).This procedure is terminated if the current tested point is unqualified.Consequently, all previously qualified points are merged with the center point, thus forming a small segment.All other points are returned to the point cloud.This process is repeated to a new ungrouped point, and continued until all points have a segment ID.The resulting segments are a group of cover sets (Figure 3), similar to that of Raumonen et al. [52].We further note that the outcome of the initial segmentation only serves as the underlying architecture of the DSM algorithm, and does not significantly affect the merging result.Different degrees of over-segmentation can be carried out depending on the dataset and available computation power.

Dynamic Merging
The segment merging processing requires the knowledge of neighboring relations [52].In image processing, identifying neighbor pixels is more straightforward.Nevertheless, it can be cumbersome for a point cloud because a point cloud is unstructured.Usually, a search radius has to be defined to find adjacent points and segments [53,54].In this study, we define the search radius as the 99th quantile of point spacing.Point spacing for a single point is defined as the distance to its nearest point.Consequently, the search radius is adapted to the point cloud itself, without the need of user setup.
For each segment, its adjacent segments can be identified (Figure 4).For efficient computation, we can build an adjacency matrix for all segments.The adjacency relationship of the target segment in Figure 4 is shown in Figure 5a.Initially, the adjacency graph is unweighted.Our DSM method finds the most similar adjacent segment and dynamically updates the adjacency relation with similarity metrics of adjacent regions if the target segment is changed.For this purpose, we define a similarity measure (SL) as where x, y, and z stand for the mean values.As described in Section 3.1.2,the constraint of merging used in this study is the D n z .Therefore, we define the threshold T as 0.1 for dataset PD-1, PD-2, and PD-3, as TLS has higher range accuracy.For dataset SD-1 and PD-4, we set the T as 0.2 to tolerate the noisy data inside the canopy and data generated from the hand-held scanner.We start with the initial segment with the largest size (target segment).Its adjacent segments are identified and ranked by their SL metrics in ascending order by the above-mentioned strategy.If there is at least one adjacent segment with a SL value unequal to 0, meaning that at least one segment is qualified, the adjacent segment with the largest SL is merged with the target segment.Afterwards, the adjacency relation has to be updated since the target segment is grown.The adjacent segments of the original target segment of the merged segment are combined.The similarity metrics SL are re-calculated and ranked against the new enlarged target segment.Furthermore, we constrain that the standard deviation of N z within a segment cannot exceed 80% of the threshold T (Equation ( 3)).The 80% is set empirically.In this way, we ensure the global geometrical homogeneity of each segment.If no adjacent segments can be merged, the procedure is terminated and the segment with the largest size among the rest of the segments again serves as the target segment.The process is traversed over all segments.As a consequence, our proposed DSM method tries to merge the most similar qualified adjacent segment, and adjusts the testing during the whole procedure.3), thus it has 0 weight to the target segment 1.For this example, segment 10 has the largest weight, meaning that it is the most similar qualified segment.Therefore, it will merge with the target segment 1 in this step.

Post Processing
In this study, we use the absolute deviation of N z as a unique constraint for merging the segments of a point cloud.In some cases, several branches can grow closely and with the same vertical orientation (Figure 6a).As a consequence, they will be grouped as a single segment.Although the segment only contains most points from stems and branches, the linearity of the segment will be impeded as multiple branches are merged as one segment.To mitigate such a situation, we further develop a top-shift algorithm to separate multiple branches into individual ones.The top-shift method works similarly to the mean-shift algorithm [55].Instead of iteratively projecting points to their local mean, we project points to their local highest points.The search radius (bandwidth) is the same as the one used in the DSM method, which is the 99th quantile of point spacing.This simple idea will separate points from various branches, because they grow into different directions at some point following the branch structures (Figure 6b).Meanwhile, this top-shift processing also helps to identify wrongly merged points from the DSM procedure (Figure 6b), as they will converge to the local highest, while branch points converge to the top of each branch, respectively.

Segment Feature
The DSM method robustly partitions the point cloud into meaningful segments.For each segment, we estimate its linearity feature saliency to examine if the segment is part of tree stems or branches.Unlike point-wise features (e.g., [30]), segment features are calculated based on all points of the segment.Linearity is defined as (EV 1 − EV 2 )/EV 1 .EV 1 , EV 2 , and EV 3 are three eigenvalues from the covariance matrix (Equation ( 1)), and are sorted in descending order.For segment features, the covariance matrix is formed by all points in a In this study, we recognize segments with linearity values above a threshold TH l and number of points more than TH n as tree stems and branches (non-photosynthetic components of trees).The thresholds TH l and TH n are determined by trial-and-error (Figure 7).

Tile Processing
Point cloud manipulation can be computationally intensive.Data measured from TLS can contain millions to billions of points [11], thus efficient manipulation is important.As used in other software and studies [56,57], we partition the whole point cloud into tiles, and process each tile individually and in parallel.We develop three routines to partition a point cloud; uniform voxels, octree, and 2D tiles (Figure 8).Point cloud voxelization is a commonly used approach to partition a point cloud into local spaces (e.g., [58]).It is fast to generate, and works with satisfaction for a point cloud with homogeneous density distribution.For a point cloud with varied point density, such as data from single-scan TLS measurement, Octree is better at balancing the needed computational resources in each tile.Two-dimensional tiles (Figure 8c) are often deployed for large-scale datasets, such as ALS data [56].By tiling the point cloud, the processing can be run in parallel, and the memory consumption is distributed to each computer worker, avoiding memory overcommitment for a single worker.In each tile, a k-d tree structure is constructed to enable fast neighborhood and range searching.Besides, individually processed tiles are further merged afterwards [57].Figure 9 shows an example of point cloud processing on tiles and merging afterwards.The tile processing greatly facilitates and accelerates point cloud manipulation.

Random Forest Classification
In this study, we additionally apply supervised machine learning classification as a method comparison [1,18].Only point-wise geometric features are used, and the random forest (RF) classifier is explored [59].Previous studies showed that the RF model is particularly effective for separating tree photosynthetic and non-photosynthetic components from point cloud data [21,30].RF is a decision tree-based ensemble learning method that was proposed by Breiman et al. [59].The learned model is a collection of weak models.Multiple decision trees are grown on random subsets of training data.The class determination is based on a majority votes approach.Two necessary foundational parameters need to be specified for the RF model; the number of classification trees n trees and the number of features m f t used at each node.We set the number of trees n trees as 100 [30].Another parameter m f t is determined as √ p, where p denotes the number of input features, as suggested by [59].Thirty-two 3D and 2D geometrical features are calculated for each point (Table 1).Two-dimensional features are calculated from projected grids on the XY plane.Many of these features are defined and used in other point cloud classification studies [60].Training data are manually selected for each dataset.The spatial distribution of training data follows a strategy used in Ma et al. [1].We note that our aim is not to achieve the best performance from the supervised machine learning method.Some optimization strategies such as feature selection, hyper-parameters fine-tuning, and model pruning may further improve the classification performance.However, the deployed RF classier, in combination with our crafted features, has shown promising results in a machine learning benchmark study for wood-leaf separation [30].

Evaluation
The performance of our DSM method and the supervised RF method is evaluated based on three statistical indexes; sensitivity, specificity, and accuracy.Sensitivity measures the correctly classified positive samples (true positive rate, TPR).In this study, it represents the correct rate for non-photosynthetic components such as wood and other points.Specificity gives the true negative rate (TNR), thus it measures the correct rate for photosynthetic components (i.e., leaf points).Accuracy (ACC) gives the overall correctness by where P and N are the number of real positive (non-photosynthetic components) and negative (photosynthetic components) samples.TP and TN are the correctly identified positive and negative samples, respectively.

Single Tree Data
A visual inspection of the separation result from the proposed DSM method for the SD-1 dataset is shown in Figure 10a.We observe that some small branch sections at the bottom were misclassified as leaf points.Some leaf points on the canopy surface were also wrongly labeled as wood points.Nevertheless, the branches inside the canopy were generally separated from leaves successfully.Quantitatively, the overall classification accuracy of our DSM method reached 86.9% with fixed neighborbood sizes, and 88.5% when using adaptive neighborhood sizes, which were higher than that of 82.1% and 83.9% from the supervised RF method (Table 2).The achieved sensitivity from the DSM method were higher than the RF method, while the specificity was of the same class.This indicates that our method is particularly confident in detecting non-photosynthetic components.

Plot-Level Data
Figure 10b-e gives an overview of the separation results for our four plot-level datasets from PD-1 to PD-4.We observe that most non-photosynthetic components were resoundingly separated from photosynthetic components, regardless of their spatial orientations.The overall accuracy for the four plot-level datasets ranged from 78.5% to 89.5% in our DSM method with fixed neighborhood sizes, and from 81.8% to 92.0% with adaptive neighborhood sizes (Table 2).The PD-2 dataset had the worst overall accuracy.Its associated specificity was only 64.8%/73.0%,meaning that many non-photosynthetic components were mislabeled as photosynthetic points.This deduction can also be observed in Figure 10c, in which some high stem points were mislabeled as leaf points, potentially due to the low point density on the top and severe occlusions in the single-scan TLS data.Our DSM method had a sensitivity from 92.2% to 97.9% with fixed neighborhood sizes, and from 90.5% to 96.7% with adaptive ones, which were overall higher than the RF method.However, the supervised RF had significantly higher specificity than the DSM method, except the PD-4 dataset.The average overall accuracy for the DSM and RF method was 85.0% (87.5% with adaptive neighbors) and 85.1% (87.0% with adaptive neighbors), respectively.
The PD-1 dataset contains useful intensity information after radiometrical calibration.The intensity calibration and separation thresholding is described in Section 2.2.As a result, the intensity method achieved an overall accuracy of 87.6%, with a sensitivity of 92.1% and a specificity of 83.1% (Table 2).The performance of the intensity method was similar to our DSM method.

Discussion
In this study, tree photosynthetic and non-photosynthetic components are automatically separated by the proposed dynamic segment merging method.The proposed method is fully unsupervised, thus not requiring any training data and user interventions.The core concept of the DSM method is a robust point cloud segmentation routine.We have shown in this study that the DSM method has been successfully used in separating tree photosynthetic and non-photosynthetic components from point cloud data.To some extent, we solve an ongoing challenge in an unsupervised manner, and overcome the bottlenecks in other methods such as calibrating laser intensities or manually selecting training data [20].In the following subsections, we discuss some vital inputs of our algorithm, its performance, and future applications.At the same time, we address several challenges about the separation of tree photosynthetic and non-photosynthetic components from point cloud data.

Calculation of Normal Vectors
Calculation of normal vectors is crucial in this study.Figure 11 gives an example of normal vectors calculated by the adaptive method, compared to those with fixed neighborhoods for the PD-4 dataset.This dataset was acquired with a hand-held scanner, with a lower position accuracy compared to TLS.Points on the stem surfaces are not smooth due to low accuracy, thus giving a good example of the performance of normal vector estimation.Visually, the normal vectors are more homogeneous on the stems after using the optimal neighborhood selection method.The accuracy for the PD-4 dataset was improved from 85.8% to 87.3%, proving the positive effects in estimating normal vectors (Figure 11).A similar trend was also found for other datasets and for the performances of the supervised RF method.The average overall accuracy for all datasets was improved from 85.0% to 87.5% for the DSM method, and from 85.1% to 87.0% for the RF model.The improvements are more significant for our DSM method.Theoretically, this is expected as our DSM method is a kind of segmentation approach which heavily relies on the quality of normal vectors.The selection of optimal neighborhoods is inherently a multi-scale analysis.The local irregularity is smoothed in a large scale, resulting in more homogeneous normal vectors.Consequently, these branches can be identified, and the specificity was broadly improved.The supervised RF method is a machine learning classification technique that relies more on the feature distinguishability and the classifier mechanism.
However, robust and efficient estimation of normal vectors from complex scenes is still a challenging task.Although methods with adaptive neighborhood searching can be effective in quality, they are computationally intensive.The computation time and memory consumption are significantly higher for large datasets.For example, the computation time is about six times longer for the adaptive neighborbood than the fixed neighborbood in our SD-1 dataset, when using seven CPU threads.This is still a bottleneck for most applications [42].A possible solution is to explore GPU processing using a parallel implementation [42].

Algorithm Performance
The DSM method achieved generally constant performances across different datasets.The standard deviation of accuracy was 4.1% with fixed neighborhood sizes and 3.7% with adaptive sizes.The accuracy was low for the PD-2 dataset (Table 2), mostly because it was acquired with single-scan TLS, which resulted in severe occlusions and shadows in the point cloud, and low point density at distances away from the scanner.Consequently, the branches were not well represented as linear segments in the point cloud, and our DSM method relies on segmenting and detecting linear objects.In addition, we observed that our proposed DSM method achieved higher sensitivity than specificity (Table 2) across all datasets.This indicates a systematic tendency of our method for having confidence in detecting non-photosynthetic components over photosynthetic components.In this study, a low specificity means that many wood points were mislabeled as leaf points, while in general high sensitivity states our method is good at detecting wood points robustly.We note that this can be an effect of the irregularity of branch structures.If a section of a branch is severely deformed in terms of growing orientation, it will not be grouped as one segment as its normal vectors are corrupted.This challenge was partly mitigated when using normal vectors with optimal neighborhoods.The specificity ranged from 73.0% to 87.4%, compared to those with fixed neighborhoods which ranged from 64.8% to 81.5%.
In this study, we also deployed the supervised machine learning method RF as a comparison of our DSM method to the state-of-the-art methods.Overall, the RF method achieved similar accuracy compared to the DSM method.However, it required tedious works to delineate training points.These training points must be representative, as they will greatly influence the outcomes of machine learning methods.For example, if we exclude training points on the bushes for the dataset PD-2, the RF results dropped dramatically.In contrast to our method, the RF achieved better model specificity than sensitivity.Ma et al. [1] used a supervised GMM model to separate photosynthetic and non-photosynthetic components from TLS data.Their overall accuracy for single tree data was 82.6%, and further improved to 99.6% after applying a series of post filters.Yun et al. [18] deployed a SVM model for four single trees.The overall accuracy was from 89.1% to 93.5%.The result from our DSM method on the single tree dataset SD-1 was 88.5%.The RF method had an accuracy of 83.9%.For plot-level datasets, Ma et al. [1] reached a preliminary accuracy ranging from 65.8% to 75.2%, which was further improved to a range from 84.3% to 97.8%.Our DSM results for plot-level datasets were in the same standard (81.8% to 92.0%) compared to the improved results from Ma et al. [1].We note that our implemented RF achieved higher accuracy than the GMM model in Ma et al. [1].A potential explanation is that we crafted more features, and the RF model may be more efficient than GMM for such a task [30].
The proposed DSM method effectively partitions forest point cloud into photosynthetic and non-photosynthetic components.It dynamically adjusts the merging process, while the conventional region growing method uses a static strategy.For complex scenes such as forests, the conventional region growing method has difficulties in distinguishing gradual changing regions.Therefore, it under-segmented trees, as illustrated in Figure 12b.For a fair comparison, we used the same normal vectors and searching radius to compare our DSM method.Figure 12a demonstrates that our DSM method can effectively separate leaf and wood points, where conventional region growing failed.We further show a detailed inspection of PD-3 in a magnified view (Figure 13).We chose to show fine-detail information of the PD-3 plot because its complexity in tree structures is instrumental in testing the performance of our DSM method.In addition, we used a simple constraint in this study-the deviation of z-component of normal vectors (verticality)-to group tree trunks and branches.We have shown that it is effective for irregular branches (Figure 13).It is further noted that the constraint in our DSM method only serves a role to check if an adjacent segment is qualified, while the merging is determined by the similarity ranking (Equation ( 3)).In a nutshell, our DSM method is less sensitive to the selected constraints, which is one other distinction compared to conventional region growing routines.

Challenges in Components Separation
Apart from the methodology shortage, there are still many challenges in separating tree photosynthetic and non-photosynthetic components from a point cloud.Forest structures are multifarious in nature.The acquired point cloud will be affected by severe occlusions.This results in data shallows and non-uniform density distributions.For intensity-based methods [2,19], this effect is less significant.However, geometry-based methods will suffer from these challenges.For machine learning-based methods, the crafted features should cope with non-uniform density, and be robust and distinctive to guide the model.However, this is still challenging for complex scenes.Recently developed deep networks on point clouds may further help to extract higher level features [61,62].Nevertheless, these models are still in the very early development stage, and hardly work with high-density TLS data.
Another bottleneck of TLS data processing is to deal with the large amount of data.With millions and billions of data [11], point cloud manipulation can be intensive.Efficient point cloud management and processing are mainly targeted at ALS data [56].In this study, we briefly implemented tile processing together with the k − d tree structure to accelerate TLS processing.Future optimization is still imperative.

Future Applications of the DSM Method
In this study, we developed an efficient method applied to a wood-leaf separation task.This processing is a crucial step in reducing uncertainties in TLS-derived estimates of above ground biomass [20].Our method can greatly facilitate the automation in this processing.For example, Figure 14 gives an example on how our method was applied to the reconstruction and parameter retrieval of the entire Erythrophleum fordii tree (SD-1).Leaf points were detected and removed automatically by using our DSM method (Figure 14b).Subsequently, the entire tree was reconstructed by the Quantitative Structure Model (QSM) [52].Some vital parameters can then be extracted from the QSM, such as total volume (581.6L),trunk volume (436.6L), and branch length (84.4m).On the other hand, our algorithm can be easily deployed for other applications.For example, the presented pipeline in this study can be adapted to detect tree stems by simply filtering linear segments based on their spatial orientations and sizes.Based on the detected tree stems, downstream processing such as diameter estimation [63] and stem curve [9] retrieval can be realized.

Conclusions
In this study, we present a fully automatic and unsupervised approach to separate tree photosynthetic and non-photosynthetic components from point cloud data.This geometry-based method is free of user interventions and dispenses with manual delineation of training data, which is a tedious prerequisite for the supervised machine learning algorithms.Our method is based on a robust dynamic point cloud segmentation routine.The point cloud is firstly partitioned into meaningful segments by using the proposed DSM method.Non-photosynthetic segments such as stems and branches are then identified by estimating their linear feature saliency.The approach is tested for a single tree dataset and four plot-level datasets.These datasets cover single-scan TLS, multi-scan TLS, hand-held laser scanning, varied terrain conditions, and various tree species.The achieved accuracy reached 88.5% for the single tree dataset, and ranged from 81.8% to 92.0% for plot-level datasets.We also compared our results to a supervised machine learning method.In addition, we show that point cloud structuring enables efficient point cloud manipulation even for large datasets.Furthermore, we have discussed the performances and some challenges of separating photosynthetic and non-photosynthetic components in nature forests.
The extensive experiments on various datasets suggest that the proposed DSM method is efficient, and can be equally as effective as supervised machine learning methods.Nevertheless, the distinct advantage of our method lies in that it is unsupervised and fully automatic.Our work highlights the potential of unsupervised separation of wood and leaf points even in plot-level analysis.

Figure 1 .
Figure 1.Five datasets used in this study.(a) SD-1.(b) PD-1.(c) PD-2.(d) PD-3.(e) PD-4.The left column shows the original point clouds.Manually classified point clouds are shown in the middle column (non-photosynthetic components) and right column (photosynthetic components).These manually classified points served as validation sets in this study.

Figure 2 .
Figure 2. Distribution of reflectances of photosynthetic and non-photosynthetic components after radiometrical calibration.

Figure 3 .
Figure 3. Initial segments by over-segmentation.Each segment is randomly colored.The stem surface has large size segments because it is geometrically homogeneous, while bushes have smaller segments as they are more irregular in shape.
and std(N z{z=1,...,n } ∪ N z{z=1,...,n} ) <= 0.8 * T 0 else (3) where s and s are adjacent segments, and T is a threshold value.D n z , D q , and D d are three dissimilarity metrics.All three metrics are individually normalized to their max values, so that the similarity metric SL is within the range from 0 to 1. D n z stands for the N z dissimilarity with D n z =| N z − N z |.N z denotes the average value in the segment.D q is the size dissimilarity.It is calculated as the difference in number of points between two segments.D d represents the distance dissimilarity.To accelerate the computation, we calculate the D d as the closest distance among distances between any points in an adjacent region s to the center of the target region s (Equation (4)).

Figure 4 .
Figure 4. Identifying adjacency relations.(a) Original point cloud.(b) Overly partitioned segments.The middle segment with black color is the target segment.Its adjacent segments are founded by the adaptive search radius, and are randomly colored.(c) Superpoints representation of (b).

Figure 5 .
Figure 5. Adjacency graph of segments in Figure 4.The identified graph is initially unweighted (left), while it is further weighted by SL values defined in Equation (3) (right).Segment 13 is unqualified as in Equation (3), thus it has 0 weight to the target segment 1.For this example, segment 10 has the largest weight, meaning that it is the most similar qualified segment.Therefore, it will merge with the target segment 1 in this step.

Figure 6 .
Figure 6.An example of tree branches grouped into one segment with the Dynamic Segment Merging (DSM) method (a) and then separated with the top-shift procedure (b).

Figure 7 .
Figure 7.An example of a Heat map of acquired accuracy for the PD-2 dataset to locate proper TH l and TH n values.

Figure 9 .
Figure 9. Point cloud tile processing and merging.(a) DSM results on tiles.(b) Results after tile merging.

Figure 11 .
Figure 11.Normal vector estimation.(a) Normal vectors estimated with fixed neighborhoods, and colored by N z .(b) Optimal neighborhood size for each point.(c) Normal vectors estimated with optimal neighborhoods, and colored by N z .

Figure 12 .
Figure 12.Comparison of segmentation for the SD-1 dataset.Each segment is randomly colored.(a) Segmentation from the proposed DSM method.(b) Segmentation from the conventional region growing method.

Figure 13 .
Figure 13.A close inspection of results for the PD-3 dataset.

Figure 14 .
Figure 14.An example of our DSM method applied to detect wood points of the entire Erythrophleum fordii tree (a).Detected wood components are shown in (b), and the corresponding Quantitative Structure Model (QSM) is given in (c).

Table 1 .
1,ints that belong to different classes at the 25%, 50%, and 75% height of the bounding box of the forest plot are manually identified in the open source software CloudCompare (CloudCompare 2.9.1, 2018).Training points near the ground are additionally selected if low vegetation is present in the dataset.The RF classification was performed using the Statistics and Machine Learning Toolbox in Matlab 2017a (The MathWorks, Inc., Natick, Massachusetts, United States).Thirty-two features extracted from the point cloud.EV denotes the eigenvalue and NV is the normal vector.Three EVs are sorted in descending order.EV 1 defines the first eigenvalue, and so on.

Table 2 .
Results summary of various methods on all datasets.Bold numbers are the best results among methods.(F) denotes the results acquired with fixed neighborhood sizes, and (A) stands for adaptive neighborhood sizes.* Training and validation points have an equal split of photosynthetic and non-photosynthetic components.