Street Tree Extraction and Segmentation from Mobile LiDAR Point Clouds Based on Spatial Geometric Features of Object Primitives

: Extracting street trees from mobile Light Detection and Ranging (LiDAR) point clouds is still encountering challenges, such as low extraction accuracy and poor robustness in complex urban environment, and difﬁculty in the segmentation of overlapping trees. To solve these problems, this paper proposed a street tree extraction and segmentation method based on spatial geometric features of object primitives. In this paper, mobile LiDAR point clouds were ﬁrst segmented into object primitives based on the proposed graph segmentation method, which can release the computation burden effectively. According to the spatial geometric features of the segmented object primitives, stem points were extracted. In doing so, the robustness and accuracy for stem detecting can be improved. Furthermore, voxel connectivity analysis and individual tree optimization were combined successively. In doing so, the neighboring trees could be separated successfully. Four datasets located in Henan Polytechnic University, China, were used for validating the performance of the proposed method. The four mobile LiDAR point clouds contained 106, 45, 76, and 46 trees, respectively. The experimental results showed that the proposed method can achieve the performance of individual tree separation in all the four testing plots. Compared to the other three methods, the proposed method can make a good balance between the commission and omission errors and achieved the highest average F1 scores.


Introduction
Constructing a high-precision tree model is important for improving environmental quality and urban resident service [1][2][3]. The traditional methods of obtaining street tree information mainly include manual, Real-Time Kinematic, and total station measurements. These methods require a great deal of manpower and material resources. Moreover, the low information collecting efficiency of these measurements cannot meet the needs of the rapid extraction of street tree information [4].
LiDAR technology is a rapidly developing active remote sensing technology that can quickly obtain accurate three-dimensional (3D) point clouds on the surface of objects [5]. Although commercial photogrammetry software can generate millions of points from images in a cheaper way, this technique cannot provide additional information, such as intensity, reflected echoes, etc., which will be also useful for post-application. Thus, LiDAR technology has been widely applied in road detection, power line extraction, forest parameter estimation, 3D model reconstruction, and other fields [6,7].
According to different platforms, LiDAR systems can be divided into airborne LiDAR, terrestrial LiDAR, and mobile LiDAR [8]. The airborne LiDAR has the advantages of wide measurement range and high efficiency [9]. Therefore, Airborne Laser Scanning (ALS) data have been widely used in a wide range of trees extraction [10][11][12][13]. Yao et al. [14] jointly ii. The objects that are linear features are easily mistaken as tree stems, reducing the accuracy of street tree extraction. iii. Neighboring clustered trees are difficult to separate, which will lead to a larger individual tree extraction error.
To solve the problems mentioned above, a street tree extraction and segmentation method based on spatial geometric features of object primitives was proposed in this paper. In this paper, constraint conditions were first set to construct a graph structure to obtain the object primitives from MLS data. Then, stem points were detected according to the spatial geometric features of the segmented object primitives. After that, the street tree points were extracted by voxelizing the point clouds and conducting voxel component analysis. Finally, by comparing the shortest path from each point to the root points, the individual trees were separated.
The remainder of this paper is organized as follows. Section 2 describes the principle of the proposed method. In Section 3, the experiments are described. Section 4 discusses and analyzes the experimental results. Section 5 concludes this paper.

Methodology
The flowchart of the proposed method is shown in Figure 1. First, an improved morphological filtering method proposed by Hui et al. [39] was used to separate the ground points and non-ground points. Then, the non-ground points with higher elevations were removed. In this paper, the elevation threshold was set to 2 m. In doing so, almost all the tree stems can be retained while crown points or high building points can be removed. Subsequently, the graph structure was constructed and the object primitives were obtained by graph segmentation under the proposed constraints in this paper. After that, the spatial characteristics of the object primitives were calculated to extract the tree stem points. Then, the non-ground points were voxelized, and the initial tree points were extracted by analyzing the connectivity of adjacent voxels. To separate neighboring clustered trees successfully, this paper segmented the initial tree points through the shortest path analysis to obtain the individual trees. Four main steps were included in this paper: (i) object primitive acquisition based on graph segmentation, (ii) stems detection using object primitive characteristics, (iii) initial tree identification based on the connectivity analysis, and (iv) individual tree optimization based on voxel shortest path analysis. i.
The robustness of the street tree extraction method is poor; when encountering complex urban environment extraction, accuracy will be low. ii. The objects that are linear features are easily mistaken as tree stems, reducing the accuracy of street tree extraction. iii. Neighboring clustered trees are difficult to separate, which will lead to a larger individual tree extraction error.
To solve the problems mentioned above, a street tree extraction and segmentation method based on spatial geometric features of object primitives was proposed in this paper. In this paper, constraint conditions were first set to construct a graph structure to obtain the object primitives from MLS data. Then, stem points were detected according to the spatial geometric features of the segmented object primitives. After that, the street tree points were extracted by voxelizing the point clouds and conducting voxel component analysis. Finally, by comparing the shortest path from each point to the root points, the individual trees were separated.
The remainder of this paper is organized as follows. Section 2 describes the principle of the proposed method. In Section 3, the experiments are described. Section 4 discusses and analyzes the experimental results. Section 5 concludes this paper.

Methodology
The flowchart of the proposed method is shown in Figure 1. First, an improved morphological filtering method proposed by Hui et al. [39] was used to separate the ground points and non-ground points. Then, the non-ground points with higher elevations were removed. In this paper, the elevation threshold was set to 2 m. In doing so, almost all the tree stems can be retained while crown points or high building points can be removed. Subsequently, the graph structure was constructed and the object primitives were obtained by graph segmentation under the proposed constraints in this paper. After that, the spatial characteristics of the object primitives were calculated to extract the tree stem points. Then, the non-ground points were voxelized, and the initial tree points were extracted by analyzing the connectivity of adjacent voxels. To separate neighboring clustered trees successfully, this paper segmented the initial tree points through the shortest path analysis to obtain the individual trees. Four main steps were included in this paper: (i) object primitive acquisition based on graph segmentation, (ii) stems detection using object primitive characteristics, (iii) initial tree identification based on the connectivity analysis, and (iv) individual tree optimization based on voxel shortest path analysis.

Object Primitives Acquisition Based on Graph Segmentation
To extract tree stems effectively, this paper first acquired the initial object primitives by building a graph structure of the points. The graph is defined as Equation (1) [40]: where V represents nodes (v i ) and E represents edges (e i,j ). v i is made up by all points (p i ,i = 1, 2, . . . , N), while e i,j connects pairs of neighboring points (p i , p j ). In general, edge construction determines the results of the graph segmentation. To extract tree stem objectives effectively, this paper constructed the graph based on the proposed three constraints, as shown in Figure 2.

Object Primitives Acquisition Based on Graph Segmentation
To extract tree stems effectively, this paper first acquired the initial object primitives by building a graph structure of the points. The graph is defined as Equation (1) [40]: where V represents nodes ( i v ) and E represents edges ( ,  Figure 2.

The Constraint of Neighboring Radius
To limit the complexity of the graph, the edges within the graph should be constrained. The effective way is to construct the edges only within the neighboring points. In general, there are two means that can be used for defining the neighboring points. One is to set a fixed number of neighboring points. However, the density of MLS data is varied; if a fixed number of neighboring points is selected, then there will be longer edges within the graph. This is because the distance constraint is lacked. The graph structure tends to have long edges when the point cloud is sparse or outliers exist. The other way of defining neighboring points is to set a fixed neighboring radius. However, the threshold of the fixed neighboring radius is hard to determine. To solve this problem, this paper combined these two neighboring determination methods together. First, n points were selected from tree points randomly, and then, the k nearest neighboring points for the n points were detected. For each set of k nearest neighbors, the longest distance within the k points can be calculated. The fixed neighboring radius is calculated as the average value of these k longest distances. As shown in Figure 3a, R is the calculated fixed neighboring radius, which can be defined as Equation (2): where n is the number of randomly selected points, i j dis is the j-th neighboring distance, and k is the number of neighboring points. It must be noted that k influences the graph complexity and implementation efficiency. Considering the balance between them, k is set to 10 in this paper [41]. According to Equation (2), the neighborhood radius can be automatically determined according to the density and spatial distribution of different data.

The Constraint of Neighboring Radius
To limit the complexity of the graph, the edges within the graph should be constrained. The effective way is to construct the edges only within the neighboring points. In general, there are two means that can be used for defining the neighboring points. One is to set a fixed number of neighboring points. However, the density of MLS data is varied; if a fixed number of neighboring points is selected, then there will be longer edges within the graph. This is because the distance constraint is lacked. The graph structure tends to have long edges when the point cloud is sparse or outliers exist. The other way of defining neighboring points is to set a fixed neighboring radius. However, the threshold of the fixed neighboring radius is hard to determine. To solve this problem, this paper combined these two neighboring determination methods together. First, n points were selected from tree points randomly, and then, the k nearest neighboring points for the n points were detected. For each set of k nearest neighbors, the longest distance within the k points can be calculated. The fixed neighboring radius is calculated as the average value of these k longest distances. As shown in Figure 3a, R is the calculated fixed neighboring radius, which can be defined as Equation (2): where n is the number of randomly selected points, dis i j is the j-th neighboring distance, and k is the number of neighboring points. It must be noted that k influences the graph complexity and implementation efficiency. Considering the balance between them, k is set to 10 in this paper [41]. According to Equation (2), the neighborhood radius can be automatically determined according to the density and spatial distribution of different data.

The Constraint of the Angle between Normal Vectors
In general, the object primitives of tree stems have a normal vector consistency. That is, the normal vector angle between adjacent points is small. Therefore, this feature can be used to separate these non-ground objects into the same object primitives. On the contrary, the points of crowns and low vegetation are scattered and do not have the consistency of the normal vector. As a result, these objects will be divided into small, scattered object primitives. Thus, to extract the tree stem correctly, the points within the same object primitive should have similar normal vectors. In this paper, the angle (θ) of normal vectors between each point and its neighboring points is calculated to obtain the object primitives with similar spatial characteristics, as shown in Figure 3b. If θ is less than the threshold, the edge between the two points is preserved. In this paper, the threshold was set to 30 • by

The Constraint of the Angle between Normal Vectors
In general, the object primitives of tree stems have a normal vector consistency. That is, the normal vector angle between adjacent points is small. Therefore, this feature can be used to separate these non-ground objects into the same object primitives. On the contrary, the points of crowns and low vegetation are scattered and do not have the consistency of the normal vector. As a result, these objects will be divided into small, scattered object primitives. Thus, to extract the tree stem correctly, the points within the same object primitive should have similar normal vectors. In this paper, the angle ( θ ) of normal vectors between each point and its neighboring points is calculated to obtain the object primitives with similar spatial characteristics, as shown in Figure 3b. If θ is less than the threshold, the edge between the two points is preserved. In this paper, the threshold was set to 30° by trial and error. Based on this threshold, stem points can be better extracted. θ is defined as Equation (3).
where 1 v and 2 v are the normal vectors of two adjacent nodes.

The Constraint of the Vertical angle Changing Rate
In general, stem points have regular vertical distribution, while other points, such as crown points, are usually in random scattered distribution. In other words, the angles between the normal vectors of crown points and the Z direction generally change greatly than that of stem points. Thus, the third constraint was set as the vertical angle changing rate in this paper, which is defined as Equation (4): where β  i is the vertical angle difference between the point and its neighboring points; as shown in Figure 3c, num is the number of neighboring points. As shown in Figure 2, the points with similar vertical angle changing rates can be divided into the same object primitives, such as tree stems and buildings. Meanwhile, the points with larger vertical angle changing rates, such as tree canopies, were divided into several smaller object primitives.
where v 1 and v 2 are the normal vectors of two adjacent nodes.

The Constraint of the Vertical Angle Changing Rate
In general, stem points have regular vertical distribution, while other points, such as crown points, are usually in random scattered distribution. In other words, the angles between the normal vectors of crown points and the Z direction generally change greatly than that of stem points. Thus, the third constraint was set as the vertical angle changing rate in this paper, which is defined as Equation (4): where β i is the vertical angle difference between the point and its neighboring points; as shown in Figure 3c, num is the number of neighboring points. As shown in Figure 2, the points with similar vertical angle changing rates can be divided into the same object primitives, such as tree stems and buildings. Meanwhile, the points with larger vertical angle changing rates, such as tree canopies, were divided into several smaller object primitives.

Stems Detection Using Geometric Characteristics of Object Primitives
After the graph segmentation, the point clouds will be divided into multiple object primitives. The stem primitives usually present independent cylindrical shapes. To extract stem points accurately, this paper calculated the geometric features of each object primitive, as shown in Figure 4a. Figure 4b shows the calculation of linearity. It can be found that most stem primitives own large linearity, such as the yellow points shown in Figure 4c. The linearity of object primitives is defined as Equation (5): where λ is the eigenvalue of the covariance matrix of the primitive. Here, λ 1 > λ 2 > λ 3 > 0. From Figure 4b, it can be found that, compared with crown points, λ 1 of the stem and branch is significantly larger than λ 2 . Thus, the linearities of stem and branch points are generally larger than the ones of crown points.
Forests 2022, 13, 1245 where λ is the eigenvalue of the covariance matrix of the primitive. Here, λ λ λ > > > 1 2 3 0 . From Figure 4b, it can be found that, compared with crown points, λ 1 of the stem and branch is significantly larger than λ 2 . Thus, the linearities of stem and branch points are generally larger than the ones of crown points. Since the objects, such as buildings and low vegetation, may also have high linearities, another spatial feature was adopted in this paper to detect tree stems. This paper defined it as the Height-Width (H-W) ratio of the object primitives. In this paper, the H-W ratio refers to the ratio of the height and width of the minimum bounding box of each object, as shown in Figure 4d. As the stems are in cylindrical shapes, the H-W ratio of stem primitives is larger than that of other primitives. Therefore, the stem points can be further optimized by detecting the primitives with a larger H-W ratio. The H-W ratio is defined as Equation (6): Where obj h is calculated as the elevation difference between the highest and lowest points within the object. δ x and δ y are the ranges of points within the object primitive on the X and Y axes, respectively. Based on linearity and the H-W ratio, the stems can be detected as shown in Figure 4f. Since the objects, such as buildings and low vegetation, may also have high linearities, another spatial feature was adopted in this paper to detect tree stems. This paper defined it as the Height-Width (H-W) ratio of the object primitives. In this paper, the H-W ratio refers to the ratio of the height and width of the minimum bounding box of each object, as shown in Figure 4d. As the stems are in cylindrical shapes, the H-W ratio of stem primitives is larger than that of other primitives. Therefore, the stem points can be further optimized by detecting the primitives with a larger H-W ratio. The H-W ratio is defined as Equation (6): Where h obj is calculated as the elevation difference between the highest and lowest points within the object. δ x and δ y are the ranges of points within the object primitive on the X and Y axes, respectively. Based on linearity and the H-W ratio, the stems can be detected as shown in Figure 4f.

Initial Tree Points Identification Based on Voxel Connectivity Analysis
After tree stem extraction, initial tree points were acquired based on voxel connectivity analysis. As shown in Figure 5a, stems (red points) were detected by the method described in Section 2.2. Subsequently, all the points were voxelized as a series of voxels. To obtain initial tree points, the voxel connectivity analysis was applied to the voxelization result. The process of voxel connectivity analysis is shown in Figure 6 [42]. Figure 6a shows the voxels before segmentation. The voxels with no point are labeled as 0, while the voxels containing points are labeled as 1. Figure 6b is the six-connected model used in this paper. This means only the voxels distributed as the six-connected model can be seen as connectivity. In the step of voxel connectivity analysis, the voxels of stems were selected as the seed voxels. Then, the neighboring voxels were searched, and the connectivity to the seed voxels was judged. Only the adjacent voxels with common faces are regarded as having connectivity, as mentioned above. Subsequently, the connected result can be obtained as shown in Figure 6c. stems detection, some elongated objects, such as lamp posts, telephone poles, etc., will also be misclassified as tree stems. This is because these object primitives also own a larger linearity and H-W ratio simultaneously. To avoid the influence of these wrongly detected stems, this paper removed the objects with a smaller number of voxels after the voxel connectivity analysis. This is because the size of the wrongly detected trees, such as lamp posts, are generally smaller. Thus, by limiting the number of voxels for each object, the wrongly detected trees can be removed. tree points can be obtained as shown in Figure 5d. It should be noted that, in the step of stems detection, some elongated objects, such as lamp posts, telephone poles, etc., will also be misclassified as tree stems. This is because these object primitives also own a larger linearity and H-W ratio simultaneously. To avoid the influence of these wrongly detected stems, this paper removed the objects with a smaller number of voxels after the voxel connectivity analysis. This is because the size of the wrongly detected trees, such as lamp posts, are generally smaller. Thus, by limiting the number of voxels for each object, the wrongly detected trees can be removed. Using the proposed voxel connectivity analysis, the voxels connected to stems can be merged together, as shown in Figure 5c. According to the connected results, the whole tree points can be obtained as shown in Figure 5d. It should be noted that, in the step of stems detection, some elongated objects, such as lamp posts, telephone poles, etc., will also be misclassified as tree stems. This is because these object primitives also own a larger linearity and H-W ratio simultaneously. To avoid the influence of these wrongly detected stems, this paper removed the objects with a smaller number of voxels after the voxel connectivity analysis. This is because the size of the wrongly detected trees, such as lamp posts, are generally smaller. Thus, by limiting the number of voxels for each object, the wrongly detected trees can be removed.

Individual Trees Optimization Based on Voxel Shortest Path Analysis
From Figure 5, it can be found that, after the voxel connectivity analysis, tree points can be discriminated from other points based on the detected tree stems. However, there are some neighboring clustered trees that are connected as one tree, as shown in Figure 5d. This is because the distance between the two neighboring trees is close; when conducting voxel connectivity analysis, it will be easy to connect the neighboring trees as one tree. Thus, further optimization should be conducted to obtain optimal individual tree separation results.
The flowchart of individual tree optimization is shown in Figure 7. In this method, each tree primitive was traversed one by one. The tree primitives are the tree point extraction results after the voxel connectivity analysis. Obviously, there will be one or more trees contained in each tree primitive. To determine this, the stems within each tree primitive are counted. During the traversing, if more than one stem is within the tree, then the tree primitive is under-segmented and should be further optimized. In the optimization, the under-segmented tree primitive is first voxelized, and then, the center of each voxel forms the built graph. In doing so, the calculation burden will be released, and the efficiency will be improved. Since the under-segmented tree primitive contains more than one stem, each point within the tree primitive will have a shortest path (SP) to a different base of stems, according to the Dijkstra algorithm shown in Figure 8. Obviously, the points should belong to the stems with shorter SPs. The Dijkstra algorithm is a famous SP calculation method [43]. In the Dijkstra algorithm, the direct path from the source node to the end nodes is first calculated. Then, the shortest path among them can be calculated, which will be used for adjusting the other paths. Within the adjusted paths, the shortest one can be found. This process is iterated until all the SPs from the source node to the other end nodes are detected. By comparing the SPs, the neighboring trees can be optimized as individual trees.
5d. This is because the distance between the two neighboring trees is close; when co ducting voxel connectivity analysis, it will be easy to connect the neighboring trees as o tree. Thus, further optimization should be conducted to obtain optimal individual t separation results.
The flowchart of individual tree optimization is shown in Figure 7. In this meth each tree primitive was traversed one by one. The tree primitives are the tree point traction results after the voxel connectivity analysis. Obviously, there will be one or m trees contained in each tree primitive. To determine this, the stems within each t primitive are counted. During the traversing, if more than one stem is within the tr then the tree primitive is under-segmented and should be further optimized. In the o timization, the under-segmented tree primitive is first voxelized, and then, the center each voxel forms the built graph. In doing so, the calculation burden will be released, a the efficiency will be improved. Since the under-segmented tree primitive contains m than one stem, each point within the tree primitive will have a shortest path (SP) t different base of stems, according to the Dijkstra algorithm shown in Figure 8. Obvious the points should belong to the stems with shorter SPs. The Dijkstra algorithm is a mous SP calculation method [43]. In the Dijkstra algorithm, the direct path from source node to the end nodes is first calculated. Then, the shortest path among them c be calculated, which will be used for adjusting the other paths. Within the adjusted pat the shortest one can be found. This process is iterated until all the SPs from the sou node to the other end nodes are detected. By comparing the SPs, the neighboring tr can be optimized as individual trees.

Experimental Datasets
To evaluate the performance of the proposed method, four datasets located in the Henan Polytechnic University, China, were used for testing [44]. The datasets were obtained by SSW-2 MLS along road scenes in the campus. The acquired point clouds in-

Experimental Datasets
To evaluate the performance of the proposed method, four datasets located in the Henan Polytechnic University, China, were used for testing [44]. The datasets were obtained by SSW-2 MLS along road scenes in the campus. The acquired point clouds include street trees, buildings, street lamps, cars, pedestrians, etc. As shown in Figure 9, the street trees in four areas have different distribution characteristics. In Sample 1 (Figure 9a) and Sample 4 (Figure 9d), there are a large number of street trees with different heights and overlapping crowns. In Sample 2 ( Figure 9b) and Sample 3 (Figure 9c), most trees are regularly distributed and have similar heights. In addition, along the street trees of the four datasets there are many different objects, such as buildings, cars, and road lamps. The characteristics of the four datasets are tabulated in Table 1. From Table 1, it can be found that the datasets can test the extraction performance of the proposed method for street trees with different distribution characteristics to verify the effectiveness and robustness of the method.

Experimental Datasets
To evaluate the performance of the proposed method, four datasets located in the Henan Polytechnic University, China, were used for testing [44]. The datasets were obtained by SSW-2 MLS along road scenes in the campus. The acquired point clouds include street trees, buildings, street lamps, cars, pedestrians, etc. As shown in Figure 9, the street trees in four areas have different distribution characteristics. In Sample 1 ( Figure  9a) and Sample 4 (Figure 9d), there are a large number of street trees with different heights and overlapping crowns. In Sample 2 ( Figure 9b) and Sample 3 (Figure 9c), most trees are regularly distributed and have similar heights. In addition, along the street trees of the four datasets there are many different objects, such as buildings, cars, and road lamps. The characteristics of the four datasets are tabulated in Table 1. From Table 1, it can be found that the datasets can test the extraction performance of the proposed method for street trees with different distribution characteristics to verify the effectiveness and robustness of the method.

Experimental Results
The proposed method was implemented in a MATLAB 2020a environment on a laptop computer with an Intel ® Core™ i7-9750H CPU, 16.0 GB of RAM, and a Windows 10 64-bit operating system. The final outcome of the proposed method is the separated individual trees. Thus, to evaluate the performance of the proposed method, the referenced individual trees should be obtained. In this paper, the referenced individual trees were acquired manually using a visual software named CloudCompare [45]. Specially, tree tops of the segmented individual trees were detected manually, which serve as the reference results. In this paper, the performance of individual tree separation was accessed based on treetops matching, which was developed by Eysn et al. [46].
The process of treetop matching algorithm is shown in Figure 10. This matching method firstly detected the candidates from the highest tree tops. The tree tops within a radius of the reference tree tops were judged as the candidates. In this paper, the radius was set to 5 m. After that, the candidates were ranked depending on their height difference and the two-dimensional (2D) distance between the candidates and the reference tree tops. Then, the candidate tree tops with a large height difference were removed. If a candidate tree top showed a better height difference and its 2D distance difference from the nearest candidate was less than 2.5 m, it was determined as the matching point of the reference tree top.

Experimental Results
The proposed method was implemented in a MATLAB 2020a environment on a laptop computer with an Intel ® Core™ i7-9750H CPU, 16.0 GB of RAM, and a Windows 10 64-bit operating system. The final outcome of the proposed method is the separated individual trees. Thus, to evaluate the performance of the proposed method, the referenced individual trees should be obtained. In this paper, the referenced individual trees were acquired manually using a visual software named CloudCompare [45]. Specially, tree tops of the segmented individual trees were detected manually, which serve as the reference results. In this paper, the performance of individual tree separation was accessed based on treetops matching, which was developed by Eysn et al. [46] The process of treetop matching algorithm is shown in Figure 10. This matching method firstly detected the candidates from the highest tree tops. The tree tops within a radius of the reference tree tops were judged as the candidates. In this paper, the radius was set to 5 m. After that, the candidates were ranked depending on their height difference and the two-dimensional (2D) distance between the candidates and the reference tree tops. Then, the candidate tree tops with a large height difference were removed. If a candidate tree top showed a better height difference and its 2D distance difference from the nearest candidate was less than 2.5 m, it was determined as the matching point of the reference tree top. Figure 10. The treetop matching method. The treetops within a radius of the reference tree tops were judged as the candidates. Voting candidates with the height difference and 2D distance between the candidates and the reference tree tops. If a candidate tree top showed a better height difference and its 2D distance difference from the nearest candidate was less than 2.5 m, it was determined as the matching tree top. The numbers above the points indicate the elevation values. Figure 10. The treetop matching method. The treetops within a radius of the reference tree tops were judged as the candidates. Voting candidates with the height difference and 2D distance between the candidates and the reference tree tops. If a candidate tree top showed a better height difference and its 2D distance difference from the nearest candidate was less than 2.5 m, it was determined as the matching tree top. The numbers above the points indicate the elevation values.
Five indicators were adopted to evaluate the precision of the individual tree segmentation results. They are extraction rate, matching rate, commission error, omission error, and F1 score. The above five indicators are calculated as Equations (7)- (11): Matching rate = Num_match Num_re f (8) Omission error = Num_re f − Num_match Num_re f (10) where Num_test is the number of detected tree tops in the segmentation result, i.e., the total number of the detected individual trees, Num_re f represents the tree tops in the reference data, i.e., the accurate number of individual trees, Num_match is the number of detected tree tops matching the reference data, i.e., the number of correctly detected trees. The extraction rate represents the percentage of the detected trees to the total number of reference trees. The matching rate indicates how well the detected trees match the reference. Omission error is the percentage of trees that are not correctly segmented. Commission error tends to represent the over-segmentation appearance of the method. F1 score is a comprehensive indicator that reflect the effectiveness of the method. Table 2 tabulates the accuracy calculation results of the proposed method. From Table 2, it can be found that the extraction rates in the four datasets were all close to 1. This means that the proposed method did not tend to over-segment more trees or wrongfully detect less trees. The average matching rate is larger than 0.85. Thus, it can be concluded that the proposed method can achieve a good extraction result compared with referenced individual trees. The commission and omission errors in Sample 4 are a little larger. This is because there are many weeping willows in Sample 4. Some branches of the weeping willows are attached to the ground, which will easily be misclassified as stems. Thus, the detection errors in this area are larger, while the F1 score is a little smaller. However, the F1 scores of the other three areas are also larger than 0.8. As a result, the average F1 score for the four areas is still larger than 0.8. This indicates that the proposed method can achieve good individual tree detection performance.

Discussion and Analysis
In this paper, three main steps were involved, including stem detection, tree point identification, and individual tree optimization. To further analyze each step in detail, this paper will discuss these three steps, respectively.
In the process of stem detection, two spatial geometric features of object primitives were involved, namely linearity (linearity obj ) and the H-W ratio (ratio obj ). The linearity reflects the similarity between the overall shape of each object primitive and the stems. The higher the value, the more likely the object primitive is to be a stem. Figure 11a shows the object primitives extracted only based on linearity. The linearity values of reserved object primitives are all higher than 0.9. It can be seen that most of the stems were extracted correctly, but there were still some other objects (green and yellow points) wrongly reserved. These are the facades of buildings whose linearity values meet the threshold. Therefore, when the linearity of the object primitives is only constrained, the vertical distributed stems and some other artificial objects extending horizontally or slanting are retained at the same time. Thus, the correctness of street tree extraction is low.
shows the object primitives extracted only based on linearity. The linearity values of re-served object primitives are all higher than 0.9. It can be seen that most of the stems were extracted correctly, but there were still some other objects (green and yellow points) wrongly reserved. These are the facades of buildings whose linearity values meet the threshold. Therefore, when the linearity of the object primitives is only constrained, the vertical distributed stems and some other artificial objects extending horizontally or slanting are retained at the same time. Thus, the correctness of street tree extraction is low. The H-W radio reflects the vertical distribution characteristics of object primitivity. For example, the H-W radio of the stem object primitive is usually higher than that of other object primitives. This is because the stem is a slender cylinder that grows vertically, and the height of the cylinder is significantly greater than its bottom diameter. Figure 11b is the object primitives extracted only based on the H-W ratio. From Figure  11b, some object primitives with fewer points were wrongly extracted (red points in Figure 11b). This is because these object primitives contain a few points, resulting in the widths of these object primitives being much smaller than the corresponding heights. As shown in Figure 11a,b, it can be concluded that the extracted stem results cannot be good under the constraint of linearity or H-W ratio separately. The object primitives wrongly extracted in Figure 11a have a smaller H-W ratio, while the object primitives wrongly extracted in Figure 11b have smaller linearity. Consequently, when the above two spatial geometric feature constraints are adopted simultaneously, the satisfying stem extraction results can be obtained, as shown in Figure 11c.
The second main step of the proposed method is tree point identification. The result of tree point identification directly affects the final individual tree separation outcome. In The H-W radio reflects the vertical distribution characteristics of object primitivity. For example, the H-W radio of the stem object primitive is usually higher than that of other object primitives. This is because the stem is a slender cylinder that grows vertically, and the height of the cylinder is significantly greater than its bottom diameter. Figure 11b is the object primitives extracted only based on the H-W ratio. From Figure 11b, some object primitives with fewer points were wrongly extracted (red points in Figure 11b). This is because these object primitives contain a few points, resulting in the widths of these object primitives being much smaller than the corresponding heights. As shown in Figure 11a,b, it can be concluded that the extracted stem results cannot be good under the constraint of linearity or H-W ratio separately. The object primitives wrongly extracted in Figure 11a  have a smaller H-W ratio, while the object primitives wrongly extracted in Figure 11b have smaller linearity. Consequently, when the above two spatial geometric feature constraints are adopted simultaneously, the satisfying stem extraction results can be obtained, as shown in Figure 11c.
The second main step of the proposed method is tree point identification. The result of tree point identification directly affects the final individual tree separation outcome. In this paper, tree points were identified based on a voxel connectivity analysis. After tree point identification, point clouds in the scene can be classified as two categories, namely tree points and non-tree points. Thus, Type I, Type II, and total errors can be calculated to evaluate the performance of the proposed method. Here, Type I error means the percent of tree points misclassified as non-tree points, while Type II error represents the percent of non-tree points misclassified as tree points. Total error is the percent of points wrongly classified. The accuracy calculation results are tabulated in Table 3. From Table 3, it can be found that the errors of these three types are all smaller, except for the type II error of Sample 3. This is because there are several trees that are attached with adjacent walls in Sample 3. When applying voxel connectivity analysis to the detected stems, these wall points will be misclassified as tree points. As a result, its type II error and total error are a little larger. When calculating the average values of these three types of errors, it can be found that all the average errors are smaller than 5%. Thus, it can be concluded that the proposed method can achieve a good tree point identification result. After the tree points are detected, individual tree segmentation can be carried out. To evaluate the performance of the proposed segmentation method objectively, three other individual tree extraction methods were selected for comparative analysis. Chen et al. [47] proposed a marker-controlled watershed segmentation method (Watershed). In their method, the canopy maxima model was first established, and the variable size windows were used to detect treetops in it. After that, the marker-controlled watershed segmentation method was adopted to obtain individual trees. Wang [48] proposed an unsupervised method based on superpoint graph structure (SSSC). The point cloud was recursively segmented to obtain the object primitives, and the superpoint of each object primitive was extracted. Then, an undirected superpoint graph was constructed to calculate the eigenvectors of the superpoints. Finally, the object primitives represented by the superpoint were clustered based on the eigenvectors and shortest path analysis to achieve individual tree extraction. Latella et al. [49] proposed a density-based algorithm (ITDM) to detect individual trees. They first removed the points with low elevation, such as shrub and grassland points, to reduce the calculation cost. The point cloud was then projected onto a horizontal plane, and its projection density was calculated to extract locally dense points as stem points. Finally, the height maximum points around the stem location were extracted as the treetops. Figures 12-15 show the accuracy comparison in four plots of the proposed method with the three methods mentioned above. It can be seen that the satisfying results in four plots were obtained by the proposed method. The F1 scores of the proposed method in four plots are all higher than those of other three methods, and the F1 scores of three plots are greater than 0.8. This shows that the proposed method has higher precision and stronger robustness compared to the other three methods. In the four plots, the proposed method achieves the best on three of five indicators. This indicates that the overall performance of the proposed method is the best compared with the other three methods. As shown in Figure 12, the proposed method can achieve the lowest commission rate and the best extraction rate. This shows that the proposed method performs well in the street trees with regular distribution and different heights. As shown in Figures 13 and 15, the proposed method has a low matching rate; however, its extraction rate is the best. It can be concluded that the proposed method can reduce the over-segmentation as much as possible, while ensuring the best segmentation. As shown in Figure 15, all methods have a higher commission error, which is related to the low detection correctness of the street trees. This is because there are lots of weeping willows in Sample 4. These elongated willow branches attached to the ground are often mistaken as stems. As a result, the number of wrongly extracted street trees is increased, leading to larger commission errors. Compared with the other three plots, the proposed method performed best in Sample 3, as shown Figure 14. All indicators of Sample 3 are the best out of the other three methods. This shows that the proposed method has a good detection and segmentation ability on the street trees with regular distribution.      Table 4, it can be found that the extraction rate of the proposed method is the smallest one compared to the other three methods. The extraction rate of the Watershed method is about two times of that of the proposed method. This indicates that the proposed method will not try to over-segment trees. Although the average matching rate of the proposed method is slightly lower than that of the ITDM method, the average extraction rate and commission error of the proposed method are the best compared to the other three methods. As a result, the average F1 score of the proposed method is much higher than the other three methods. This shows that the proposed method can achieve good individual tree segmentation results in four plots.    Table 4. shows the comparison of the average accuracy of the four methods. From Table 4, it can be found that the extraction rate of the proposed method is the smallest one compared to the other three methods. The extraction rate of the Watershed method is about two times of that of the proposed method. This indicates that the proposed method will not try to over-segment trees. Although the average matching rate of the proposed method is slightly lower than that of the ITDM method, the average extraction rate and commission error of the proposed method are the best compared to the other three methods. As a result, the average F1 score of the proposed method is much higher than the other three methods. This shows that the proposed method can achieve good individual tree segmentation results in four plots.  Table 4 shows the comparison of the average accuracy of the four methods. From Table 4, it can be found that the extraction rate of the proposed method is the smallest one compared to the other three methods. The extraction rate of the Watershed method is about two times of that of the proposed method. This indicates that the proposed method will not try to over-segment trees. Although the average matching rate of the proposed method is slightly lower than that of the ITDM method, the average extraction rate and commission error of the proposed method are the best compared to the other three methods. As a result, the average F1 score of the proposed method is much higher than the other three methods. This shows that the proposed method can achieve good individual tree segmentation results in four plots.

Conclusions
Street tree extraction is a significant step in the construction of a digital twin city. In this paper, mobile LiDAR point clouds were first segmented into object primitives based on the proposed graph segmentation under certain constraints. In doing so, the computation burden can be released. To improve the accuracy and robustness of stem detection, both geometric features of linearity and H-W ratio were combined. To most traditional individual tree detection methods, separating the neighboring trees is still a challenge. In this paper, voxel shortest path analysis was proposed to achieve optimized individual tree separation results. The proposed method was validated using four datasets. The experimental results showed that the proposed method can achieve good street tree extraction and segmentation performance in all four testing plots. Compared with the other three methods, the tree segmentation method in this paper achieved the highest average F1 score. This reveals that the proposed method can balance the commission rate and omission rate, effectively avoiding over-segmentation. However, the implementation of the proposed method requires setting some parameters, such as the thresholds of linearity and H-W ratio of the object primitives. How to improve the automation of the method, making the parameters determined automatically according to the particularity of each point cloud, will be focused on in our further research.