Individual Tree Crown Delineation from UAS Imagery Based on Region Growing and Growth Space Considerations

The development of unmanned aerial systems (UAS) equipped with various sensors (e.g., Lidar, multispectral sensors, and/or cameras) has provided the capability to “see” the individual trees in a forest. Individual tree crowns (ITCs) are the building blocks of precision forestry, because this knowledge allows users to analyze, model and manage the forest at the individual tree level by combing multiple data sources (e.g., remote sensing data and field surveys). Trees in the forest compete with other vegetation, especially neighboring trees, for limited resources to grow into the available horizontal and vertical space. Based on this assumption, this research developed a new region growing method that began with treetops as the initial seeds, and then segmented the ITCs, considering its growth space between the tree and its neighbors. The growth space was allocated by Euclidian distance and adjusted based on the crown size. Results showed that the over-segmentation accuracy (Oa), under-segmentation (Ua), and quality rate (QR) reached 0.784, 0.766, and 0.382, respectively, if the treetops were detected from a variable window filter based on an allometric equation for crown width. The Oa, Ua, and QR increased to 0.811, 0.853, and 0.296, respectively, when the treetops were manually adjusted. Treetop detection accuracy has a great impact on ITCs delineation accuracy. The uncertainties and limitations within this research including the interpretation error and accuracy measures were also analyzed and discussed, and a unified framework assessing the segmentation accuracy was highly suggested.


Introduction
In recent years, the rapid development of unmanned aerial systems (UAS) equipped with various sensors such as digital cameras, and multispectral, hyperspectral, or Lidar sensors has provided 3D data and/or higher spatial resolution images with greater flexibility and reliability, and for a much lower cost compared to traditional platforms such as manned aircraft or satellites [1][2][3]. These technologies have provided the capability to clearly "see" the individual trees in a forest, which has brought opportunities for precision forestry applications such as disease mapping, invasive species mapping, and fire monitoring [4]. Individual tree crowns (ITCs) are the building blocks of precision forestry, resulting in a bridge to connect multi-dimensional measurements of trees (e.g., size, crown shape) from multiple data sources (e.g., remote sensing data and field surveys) to study the forest at an individual tree level [1,5]. ITCs also enable foresters to perform better forest management and field inventory, such as selective cuts, silviculture treatments, or biodiversity assessments [6].
The ITC delineation procedure treats each tree crown as a single object, with a separate boundary from the background or other vegetation [6]. Image segmentation, the process of grouping Remote Sens. 2020, 12, 2363 3 of 15 Figure 1 summarizes the workflow of the improved region growing method proposed in this research. First, treetops were detected with a local window filter, which varied based on an allometric equation for crown width (Figure 1a). The detected treetops were regarded as region growing seeds. Second, the growth space of a tree was modeled by an adjusted Euclidean allocation, which considered the relative tree crown size and distance to its neighboring trees (Figure 1b). The growth space of a tree was then divided into eight smaller sections in terms of Euclidean directions (Figure 1c). Region growing was performed within each section by combining the spectral and spatial distance to the treetop. Steps 2 to 3 were performed iteratively, until all detected trees were segmented. Finally, a segmentation accuracy assessment was implemented by comparing the segmented ITCs to reference crowns chosen by random sampling (Figure 1d).

Workflow
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 15 respectively, in which both the strengths and limitations of this method were analyzed. The major conclusions are highlighted in the last section. Figure 1 summarizes the workflow of the improved region growing method proposed in this research. First, treetops were detected with a local window filter, which varied based on an allometric equation for crown width (Figure 1a). The detected treetops were regarded as region growing seeds. Second, the growth space of a tree was modeled by an adjusted Euclidean allocation, which considered the relative tree crown size and distance to its neighboring trees (Figure 1b). The growth space of a tree was then divided into eight smaller sections in terms of Euclidean directions ( Figure  1c). Region growing was performed within each section by combining the spectral and spatial distance to the treetop. Steps 2 to 3 were performed iteratively, until all detected trees were segmented. Finally, a segmentation accuracy assessment was implemented by comparing the segmented ITCs to reference crowns chosen by random sampling (Figure 1d).

Individual Treetop Detection
Individual treetops were detected by running a variable window filter algorithm, developed by Popescu and Wynne [31], on a CHM. To start, each pixel in the CHM with a height greater than 5 m was initially treated as a potential treetop. Previous land cover mapping efforts defined forests as any land with trees greater than 5 m [32,33]. Therefore, pixels with a height less than 5 m were masked out. Tree crowns were assumed to be circular; thus, a local circular window filter centered at each pixel (x, y) was generated. The size of the filter was determined by means of an allometric equation that estimates crown width (CW) from tree height (Equation (1)) [31,34].
In Equation (1), H represents the height in meters at (x, y). The estimated crown width (CW) at pixel (x, y) was transformed into a window size, based on the spatial resolution of the CHM. The allometric equation (Equation (1)) was derived by Popescu and Wynne [31] from 424 deciduous and coniferous trees. This equation is considered representative of the eastern United States.
The filtering algorithm assumes the treetop is the highest point within the crown. A pixel (x, y) is chosen as a treetop only if it is the highest within the local filter. Each detected tree has a unique ID ranging from 1 to N where N is the total number of trees. Figure 1a shows an example of treetops detected by this algorithm.

Adjusted Euclidean Allocation
A tree's growth space was modeled based on its crown size and distance to its neighboring trees (Figure 1b) using an adjusted Euclidean distance (AED), defined here as the Euclidean distance multiplied by the crown width ((Equation (2)). A pixel (x, y) was assigned to the j th tree's allocation only if it had the minimum AED to this tree.
In Equation (2), (x, y) represents any pixel whose allocation needs to be determined and (x j , y j ) indicates the j th detected tree. Additionally, CW j is the crown width of the j th tree.

Euclidean Direction and Region Growing
Each tree's growth space was further divided into eight, 45 • wide quadrants, where the north quadrant ranged from −22.5 • to 22.5 • , and so on ( Figure 1c). The region growing was performed within each quadrant of the growth space. The algorithm used the detected treetops as the initial seeds and calculated the similarity between the treetop and a candidate crown pixel as the spectral distance (SD) between said pixels (Equation (3)) multiplied by 1 minus a distance decay function ( f (d)) (Equation (4)). Any candidate pixel (x, y) located within a growth space quadrant was included as part of this tree's crown if the similarity was less than a global threshold (θ) (Equation (5)).
In Equation (3), SD is the spectral distance between a treetop and a candidate pixel, where S represents the spectral vector of a candidate pixel and T is the transpose of the vector. All bands including R, G, and B were used to calculate the spectral distance. The detected treetops (initial seeds) may hit noise pixels (e.g., branches), and, therefore, the average spectral vector (µ κ ) of each Remote Sens. 2020, 12, 2363 5 of 15 seed's neighboring 5 × 5 pixels was used to smooth the spectra. The f (d) in Equation (4) denotes the exponential distance decay function where d represents the candidate pixel's Euclidean distance to the seed and h is defined as the maximum distance between the seed and Euclidean boundary within that growth space quadrant. As d increases, 1 − f (d) increases, which means candidate pixels further from the seed (i.e., treetop), but with a low spectral distance (SD), would be included as part of the crown. This improved region growing algorithm used 4-connectivity to describe its neighboring pixels. It is worth noting that any candidate pixel outside the growth space quadrant can still be included as part of a tree, which means the distance (d) can be greater than h, but it should have a lower SD.
Note that as the candidate pixel moves further away from the seed, (1 − f (d)) increases and spectral distance (SD) potentially increases, because it may hit another spectrally different tree crown. Therefore, this algorithm automatically converges towards a final segmentation. However, like other threshold-based approaches [22,23], the segmentation results will vary based on the chosen global threshold. To determine the best threshold for segmentation, the global threshold (θ) values between 10 to 20 were tested. The threshold was systematically increased by 1 after each segmentation. The chosen segmentation is the one that resulted in the highest accuracy based on a comparison to reference tree crowns (Section 2.6.2) It is also noteworthy that if the inequation (Equation (5)) divides (1 − f (d)) by both sides, it becomes a dynamic local threshold (θ/(1 − f (d)) defined for SD within each growth space quadrant.

Hole Filling
The segmented result may have holes within an ITC because of, for example, the branches or background pixels that were not included as part of the crown. The morphological flood-fill operation was further applied to fill holes within each tree crown. The flood-fill process was based on dilation, complementation, and intersection [35,36].

Individual Tree Detection Accuracy Assessment
The individual tree detection (ITD) accuracy assessment was performed by comparing a sample of the detected trees using the proposed method with reference trees that were manually interpreted. The results were presented as an error matrix (Table 1). In the error matrix, TP (true positive) represents the number of trees that are correctly detected. FP (false positive) represents the number of detected trees that were not matched with a reference tree, also known as commission error. FN (false negative) indicates the number of undetected reference trees, also known as omission error. TN (true negative) occurs where there was no tree detected and there was also no reference tree. TN is not necessary for calculating the accuracy measures. The recall (r), precision (p), and F-score (F) were estimated from the error matrix values [37].

Segmentation Accuracy Assessment
A total of n segmented tree crowns whose treetops were detected correctly were randomly selected for a segmentation accuracy assessment. The corresponding reference tree crowns were manually interpreted and digitized by combining a natural color image with the CHM data. A segmented tree crown has automatically built a "one-to-one" relationship with its reference tree crown because both contain the same treetop. The segmentation accuracy was reported using over-segmentation accuracy (Oa), under-segmentation accuracy (Ua), and quality rate (QR) (Equations (6)-(8) respectively) [38,39].
In Equations (6)-(8) the symbol ∩ represents the intersection of the reference and detected crown polygons, while the ∪ denotes their union. The r i represents i th reference crown, while s i indicates the segmented tree crown corresponding to r i . n represents the sampling size.
Both Oa and Ua are based on the intersection of the segmented tree crown with its reference crown. QR combines the intersection and union regions and considers the geometrical similarity [40]. A higher Oa or Ua indicates greater segmentation accuracy, while a higher QR indicates lower accuracy [39].

Experiments and Analysis
Intuitively, the treetop detection error would impact the segmentation accuracy. Therefore, two comparative experiments were conducted to examine how treetop detection error propagates through the ITC delineation. Experiment #1 followed the entire workflow described in Sections 2.2-2.6. However, for experiment #2, the results of treetop detection (Section 2.2) were manually adjusted to significantly improve the detection accuracy prior to performing the rest of the methods as described.

Experiment #1
The study site of experiment #1 was located in the College Woods Natural Area (CWNA, Figure 2) in Durham, NH, U.S.A. CWNA is owned and managed by the University of New Hampshire. The central longitude and latitude of CWNA are 70 • 56'51.339"W and 43 • 8'7.935"N, respectively. CWNA is a mixed forest dominated by white pine (Pinus strobus), eastern hemlock (Tsuga canadensis), American beech (Fagus grandifolia), and oaks (Quercus spp). The study site is a spatial subset of the CWNA with a coverage area of 400 × 400 m, of which nearly 60% is occupied by coniferous species.
A total of 1961 raw images were collected on July 11th, 2018, with a senseFly eBee Plus carrying a S.O.D.A. (Sensor Optimized for Drone Application) camera, which captures natural color imagery. The flying height was 120 m above the ground and the imagery was acquired with a forward and side overlap of 85%. All the raw images were PPK post-processed with senseFly's Flight Data Manager built into the eMotion software using a nearby CORS station (Site ID: NHUN) [41]. The images were further processed by Agisoft PhotoScan Pro (v.1.6.2) [42] to create a natural color orthomosaic and digital surface model (DSM) using the processing settings suggested by Fraser and Congalton [43]. The original spatial resolution of the orthomosaic and DSM was 2.31cm and 12.10 cm, respectively. The were converted to the same projection, coordinate system, horizontal, and vertical datum. Both the natural orthomosaic and CHM were rescaled to 12 cm, due to inconsistency of spatial resolution.
For validation of the ITD, the whole study site was divided into 256 plots, each of which covered a 25 × 25 m square area. A total of 60 plots were randomly selected. The individual tree detection error matrix was created by compiling the error analysis from each plot. A total of 262 trees were randomly selected from the correctly detected trees, and their reference tree crowns were manually interpreted for segmentation accuracy assessment.
Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 15 For validation of the ITD, the whole study site was divided into 256 plots, each of which covered a 25m ×25m square area. A total of 60 plots were randomly selected. The individual tree detection error matrix was created by compiling the error analysis from each plot. A total of 262 trees were randomly selected from the correctly detected trees, and their reference tree crowns were manually interpreted for segmentation accuracy assessment.

Experiment #2
Experiment #2 was conducted to evaluate the impact of accurate treetop detection on the analysis. However, manually adjusting all detected treetops from experiment #1 to make the detection accuracy reach 100% would be arduous and nearly impossible. Therefore, a subsample, that is only the treetops corresponding to the 262 reference tree crowns in experiment #1 and their neighboring treetops were manually interpreted. Consequently, experiment #2 was implemented without Section 2.6.1 of the methods, since the treetops were manually identified.

Individual Treetop Detection Results
A total of 4164 trees were detected in experiment #1, with an average height of 23.35 meters and an average crown diameter of 7.43 meters. A total of 971 trees were identified within the 60 randomly selected plots and used for the error analysis. The tree detection error matrix is presented in Table 2. The recall (r), precision (p), and F-score (F) were 74.85%, 90.42%, and 81.90%, respectively. Figure 3 shows four examples of detected treetops. The red dots indicate TPs, while the yellow and blue dots represent FPs and FNs, respectively. The FP treetops tend to occur where the branches of tall coniferous trees overhang deciduous canopies (Figure 3a), background (Figure 3a and Figure 3c), or

Experiment #2
Experiment #2 was conducted to evaluate the impact of accurate treetop detection on the analysis. However, manually adjusting all detected treetops from experiment #1 to make the detection accuracy reach 100% would be arduous and nearly impossible. Therefore, a subsample, that is only the treetops corresponding to the 262 reference tree crowns in experiment #1 and their neighboring treetops were manually interpreted. Consequently, experiment #2 was implemented without Section 2.6.1 of the methods, since the treetops were manually identified.

Individual Treetop Detection Results
A total of 4164 trees were detected in experiment #1, with an average height of 23.35 m and an average crown diameter of 7.43 m. A total of 971 trees were identified within the 60 randomly selected plots and used for the error analysis. The tree detection error matrix is presented in Table 2. The recall (r), precision (p), and F-score (F) were 74.85%, 90.42%, and 81.90%, respectively. represent FPs and FNs, respectively. The FP treetops tend to occur where the branches of tall coniferous trees overhang deciduous canopies (Figure 3a), background (Figure 3a,c), or where multiple deciduous treetops (Figure 3b) were detected. FN treetops are more likely to be found on the boundary between the deciduous and coniferous trees (Figure 3d), or where smaller trees are adjacent to larger ones (Figure 3c). where multiple deciduous treetops (Figure 3b) were detected. FN treetops are more likely to be found on the boundary between the deciduous and coniferous trees (Figure 3d), or where smaller trees are adjacent to larger ones (Figure 3c).

ITC Delineation Results
To determine the best global threshold (θ) for segmentation, the threshold value was systematically varied from 10 to 20. Only the threshold values from the top five segmentation results based on QR for experiment #1 and #2 are shown in Figure 4. In experiment #1, Oa increased, while Ua decreased, as the global threshold rose from 10 to 14. The best segmentation occurs at a global threshold of 13 based on QR. At a threshold of 13, Oa, Ua, and QR reached 0.784, 0.766, and 0.382, respectively. The Oa and Ua in experiment #2 exhibited the same pattern as seen in experiment #1, as the global threshold grew from 15 to 19. However, the best ITC delineation occurs when the global threshold reaches 17. At this threshold, Oa, Ua, and QR reach 0.811, 0.853, and 0.296, respectively.

ITC Delineation Results
To determine the best global threshold (θ) for segmentation, the threshold value was systematically varied from 10 to 20. Only the threshold values from the top five segmentation results based on QR for experiment #1 and #2 are shown in Figure 4. In experiment #1, Oa increased, while Ua decreased, as the global threshold rose from 10 to 14. The best segmentation occurs at a global threshold of 13 based on QR. At a threshold of 13, Oa, Ua, and QR reached 0.784, 0.766, and 0.382, respectively. The Oa and Ua in experiment #2 exhibited the same pattern as seen in experiment #1, as the global threshold grew from 15 to 19. However, the best ITC delineation occurs when the global threshold reaches 17. At this threshold, Oa, Ua, and QR reach 0.811, 0.853, and 0.296, respectively.  Figure 5 shows eight examples of the ITC delineation results from experiment #1 (θ = 13) and #2 (θ = 17), respectively, along with their corresponding reference crowns. It is worth noting that the sub-figures in Figure 5 cannot be shown at the same scale due to the variable crown sizes. Figure 5  (a-d) show examples of deciduous trees, while Figure 5 (e-g) are examples of coniferous trees. There is no clear distinction between the results of the deciduous trees compared to the coniferous. The accuracy is highly dependent on whether a tree's neighboring trees are accurately detected (e.g., Figure 5c vs. Figure 5f). A tree's crown tends to be under-segmented in a certain direction if its neighbor is not detected in that direction (Figure 5a, 5b, 5c, 5d and 5f)). Conversely, a tree's crown is likely to be over-segmented if multiple treetops are detected within an individual tree crown ( Figure  5e). Generally, the results of experiment #2 are visually much better than the results of experiment #1, due to the improvement in treetop detection.  There is no clear distinction between the results of the deciduous trees compared to the coniferous. The accuracy is highly dependent on whether a tree's neighboring trees are accurately detected (e.g., Figure 5c vs. Figure 5f). A tree's crown tends to be under-segmented in a certain direction if its neighbor is not detected in that direction (Figure 5a-d,f)). Conversely, a tree's crown is likely to be over-segmented if multiple treetops are detected within an individual tree crown (Figure 5e). Generally, the results of experiment #2 are visually much better than the results of experiment #1, due to the improvement in treetop detection.

Discussion
This research looked to improve upon ITC delineation using region growing segmentation, by considering the growth space between a tree and its neighbors. The improved algorithm was tested with UAS imagery collected over a mixed forest. Two experiments were conducted. Experiment #1 utilized treetops detected with a variable window filter as initial seeds, while Experiment #2 utilized manually delineated treetops as initial seeds. The Ua, Oa, and QR metrics, widely accepted for validating segmentation, were employed as accuracy measures in this study, and used to compare the results of the experiments.
The best results of experiment #1 were achieved with a global threshold of 13. At this threshold, the Oa, Ua, and QR reached 0.784, 0.766, and 0.382, respectively. After the treetops were manually adjusted and, thus, assumed to be more accurate, experiment #2 achieved its best results with a global threshold of 17 and the Oa, Ua, and QR increased to 0.811, 0.853, and 0.296, respectively. The accuracy

Discussion
This research looked to improve upon ITC delineation using region growing segmentation, by considering the growth space between a tree and its neighbors. The improved algorithm was tested with UAS imagery collected over a mixed forest. Two experiments were conducted. Experiment #1 utilized treetops detected with a variable window filter as initial seeds, while Experiment #2 utilized manually delineated treetops as initial seeds. The Ua, Oa, and QR metrics, widely accepted for validating segmentation, were employed as accuracy measures in this study, and used to compare the results of the experiments.
The best results of experiment #1 were achieved with a global threshold of 13. At this threshold, the Oa, Ua, and QR reached 0.784, 0.766, and 0.382, respectively. After the treetops were manually adjusted and, thus, assumed to be more accurate, experiment #2 achieved its best results with a global threshold of 17 and the Oa, Ua, and QR increased to 0.811, 0.853, and 0.296, respectively. The accuracy of experiment #1 is higher than that of Erikson [20], who also sought to improve the region growing algorithm by adding fuzzy rules, to segment ITCs based on aerial photographs, achieving an accuracy of 73% in the end. The accuracy of experiment #2 is comparable to that achieved by Zhen, Quackenbush, Stehman, and Zhang [21]. The authors achieved an accuracy of 91.1%, after developing an agent-based region growing algorithm to delineate ITC from ALS data, using manual treetops as seeds. However, Zhen, Quackenbush, Stehman, and Zhang [21] employed the relative error of the crown area as an accuracy measure, which is quite different from the accuracy measures in this research. Thematic accuracy assessment of remote sensing classification has been well researched [45]; however, segmentation accuracy assessment still does not have unified sampling methods and accuracy measures [46]. As the spatial resolution of remote remotely sensed data has become higher, the user community has transitioned from pixel-based image analysis towards geographic object-based image analysis (GEOBIA) [7,47]. Therefore, a unified framework validating the segmentation is highly suggested.
A comparison of experiment #2 with experiment #1 indicates that the region growing algorithm developed in this research is highly dependent on the detection accuracy of treetops, which is reasonable, given that the algorithm employed here used a tree's neighbors to define its growth space. The treetop detection recall (r), precision (p), and F-scores (F) were 74.85%, 90.42%, and 81.90%, respectively. A lower recall indicates a greater number of undetected trees. When a tree goes undetected, the growth space of one or more of its neighboring trees is allowed to expand. This results in under-segmentation as the neighbor tree can now grow larger than its actual crown. One underlying reason for this lower recall could be that the allometric equation is not local, but based on field inventory data conducted in the Piedmont physiographic province of Virginia [31]. The average crown width predicted by this allometric equation, 11.63 m, is 1.09 m higher than the average width derived from the reference tree crowns. The overestimation resulted in a wider local window filter and under detection of treetops. However, there are no published allometric equations for New England and, therefore, the equation used was the best available. Another reason could be that the tree detection algorithm assumes that tree canopies possess mountainous shape, where treetops are the locally highest in the CHM data and tree edges are lower in elevation [14,48]. This assumption works for coniferous trees, but becomes less effective for the deciduous trees that often have wide, flat crowns (Figure 3d) [49][50][51]. Additionally, the CHM produced through photogrammetry and the structure from motion (SfM) algorithm tends to underestimate the height of the upper layers of the canopy and overestimate the height of the lower layers [52,53]. The result is a smoother DSM with less vertical variability, making it harder to detect the edges between the crowns. The combination of these factors highlights the importance of analyzing the error budget of treetop detection products based on the allometric equation [54].
The reference data for validating the segmentation results is another source of uncertainty. Manually interpreting the ITCs to construct the reference dataset from the UAS is not only arduous but also full of uncertainties, especially for deciduous trees that are highly overlapped (Figure 5d). The same problem occurs when measuring tree variables in the field. Many researchers have used the concept of "ground truth" to validate UAS data [55,56]. However, data collected on the ground can never truly be 100% accurate and, thus, cannot be considered the "truth" [45]. As UAS has become common for collecting reference data, to determine whether the data can be used as a reference or how much uncertainty exists in the data becomes extremely important, and therefore needs further research.
The region growing algorithm developed in this research can be easily applied to other data sources, such as multispectral or point cloud data. However, there are limitations to this algorithm that require further research. First, this study utilized a sequential order to complete the growing procedure. A pixel that has been grown by the previous seed would not allow the following seeds to grow this pixel [22]. Although a unique allocation was assigned to each detected tree, to avoid being taken by other trees, a simultaneous growing strategy could achieve better accuracies [22]. Second, this algorithm did not consider the species information. A local competition scheme for space exists between different tree species [30]. Additionally, the relationship between crown width and height may vary from one species to another [31,34]. Adding this information on competition and species specific allometric equations could have helped to improve the tree detection and growth space allocation in this study. Third, although the region growing algorithm is generally better than watershed and valley-following algorithms for segmentation [57], it is time-consuming. Experiment #1 took 8 h and 26 min, using a laptop workstation with E-2176M 6 core processor and 32GB of memory. How to choose the best features from data and combine them with this growth space requires further research.

Conclusions
This study developed a region growing algorithm that took into consideration the growth space between neighboring trees, while segmenting ITCs. The algorithm was implemented on natural color imagery derived from UAS. Results showed treetop detection accuracies, recall, precision, and F-score, were 74.85%, 90.42%, and 81.90%, respectively. Segmentation accuracies, Oa, Ua, and QR, reached 0.784, 0.766, and 0.382, respectively. The Oa, Ua, and QR were increased to 0.811, 0.853, and 0.296, respectively, when the treetops were manually updated to improve the detection accuracy. The segmentation accuracy is highly dependent on the treetop detection accuracy. The sources of uncertainties, such as the allometric equation utilized, accuracy measures, and manual interpretation, were analyzed. A unified framework validating the segmentation was highly suggested. The region growing algorithm developed in this research can be easily applied to other data sources to achieve higher accuracy. Uncertainties and limitations, including growing order, species specific growth models, and ecological competition schemes of trees, were addressed.