Tree Species Traits Determine the Success of LiDAR-Based Crown Mapping in a Mixed Temperate Forest

The ability to automatically delineate individual tree crowns using remote sensing data opens the possibility to collect detailed tree information over large geographic regions. While individual tree crown delineation (ITCD) methods have proven successful in conifer-dominated forests using Light Detection and Ranging (LiDAR) data, it remains unclear how well these methods can be applied in deciduous broadleaf-dominated forests. We applied five automated LiDAR-based ITCD methods across fifteen plots ranging from conifer- to broadleaf-dominated forest stands at Harvard Forest in Petersham, MA, USA, and assessed accuracy against manual delineation of crowns from unmanned aerial vehicle (UAV) imagery. We then identified tree- and plot-level factors influencing the success of automated delineation techniques. There was relatively little difference in accuracy between automated crown delineation methods (51–59% aggregated plot accuracy) and, despite parameter tuning, none of the methods produced high accuracy across all plots (27–90% range in plot-level accuracy). The accuracy of all methods was significantly higher with increased plot conifer fraction, and individual conifer trees were identified with higher accuracy (mean 64%) than broadleaf trees (42%) across methods. Further, while tree-level factors (e.g., diameter at breast height, height and crown area) strongly influenced the success of crown delineations, the influence of plot-level factors varied. The most important plot-level factor was species evenness, a metric of relative species abundance that is related to both conifer fraction and the degree to which trees can fill canopy space. As species evenness decreased (e.g., high conifer fraction and less efficient filling of canopy space), the probability of successful delineation increased.
Overall, our work suggests that the tested LiDAR-based ITCD methods perform equally well in a mixed temperate forest, but that delineation success is driven by forest characteristics like functional group, tree size, diversity, and crown architecture. While LiDAR-based ITCD methods are well suited for stands with distinct canopy structure, we suggest that future work explore the integration of phenology and spectral characteristics with existing LiDAR as an approach to improve crown delineation in broadleaf-dominated stands.


Acknowledgements
I would like to especially thank my advisor, Dr. Scott Ollinger, for responding to my email inquiry about undergraduate research experiences back in the winter of 2014. I did not know it at the time, but joining the Terrestrial Ecosystem Analysis Lab was one of the best opportunities I have had. I sincerely believe that I have learned more and made more true connections (both academic and lifelong friendships) during my years with TEAL than anywhere else during my academic journey. I also would like to thank Scott for providing me with a template for how to have an impactful scientific career while still pursuing all the other passions in life. Thank you for accepting me as a student even after I ran away multiple times to go on adventures of my own. Who knows, maybe I'll be back again one day.
I would like to thank all past and present members of TEAL that I have had the pleasure to do field and lab work with throughout the years. While I always take pleasure in the solitude of being in the woods alone, field work was undoubtedly always made better by their banter and witticisms, especially when hauling 50 lbs of gear up to the 10T plot at Bartlett. Nowhere else could I find a group of people that would allow me to climb 100 ft towers or shoot shotguns, and somehow learn a whole lot in the process.
I would like to highlight Lucie Lepine, especially. Lucie put unwavering faith in my fieldwork ability and allowed me to assume responsibilities within TEAL that pushed me to grow both as a scientist and as a human. She has been a true friend, and a remote sensing guru for me to turn to when I was perplexed.
Thank you to Dr. Rebecca Sanders-DeMott and Frankie Sullivan. Both Rebecca and Frankie have been early career scientist role models for me. Their willingness to share knowledge, skills, and insight has been an invaluable resource to me during this thesis. Thank you to Frankie for helping me get running with R scripting; it turns out I really enjoy writing lines of code. Thank you to Rebecca for her master-craft wordsmithing; her edits and revisions have helped turn my bearish writing into something far more eloquent.
Thank you to Dr. Andrew Ouimette. Andy allowed me to steal countless hours of his time to discuss my thesis (and any number of tangential harebrained ideas). His insight has been invaluable from the genesis of my thesis, and he should be given substantial credit for helping me shape this research project. We all know the final product is a near unrecognizable creature compared to what it started as, and I think all the better for it. I really believe a large part of the reason this thesis was so successful was because I was able to share an office with Andy (and Rebecca). I had complete freedom to openly discuss my own ideas, but I was also included as an equal in other academic discussions. Beyond being a mentor, Andy has also been a great friend, one who was just as willing to share a meal as to help fix a car, and I value his friendship far beyond his academic mentorship.
Thank you to my expanding family, Hastings and Langley. It's a blessing that they get along so well and continue to do things together even though Kate and I are now on the other side of the country. Mom and Dad, thank you for absolutely everything you've done to get me to this point. Tina and John, thank you for welcoming me into your family.
And of course, most of all thank you to my amazing partner, Kate. I've said it before but isn't it funny how I had to go halfway around the world to find out that my Love was living so close by for so long and I didn't know it? On to our next adventure.

Abstract
Automated individual tree crown delineation (ITCD) via remote sensing platforms offers a path forward to obtain wall-to-wall detailed tree inventory information over large areas. While LiDAR-based ITCD methods have proven successful in conifer-dominated forests, it remains unclear how well these methods can be applied broadly in deciduous broadleaf (hardwood) dominated forests. In this study, I applied five common automated LiDAR-based ITCD methods across fifteen plots ranging from conifer- to hardwood-dominated at the Harvard Forest in Petersham, MA, USA, and assessed accuracy against manually delineated crowns. I then identified basic tree- and plot-level factors influencing the success of delineation techniques. My results showed that automated crown delineation shows promise in closed canopy mixed-species forests.
There was relatively little difference between crown delineation methods (51–59% aggregated plot accuracy), and despite parameter tuning, none of the methods produced high accuracy across all plots (27–90% range in plot-level accuracy). I found that all methods delineated conifer species (mean 64%) better than hardwood species (mean 42%), and that the accuracy of each method varied similarly across plots and was significantly related to plot-level conifer fraction.
Further, while tree-level factors related to tree size (DBH, height and crown area) all strongly influenced the success of crown delineations, the influence of plot-level factors varied. Species evenness (relative species abundance) was the most important plot-level variable controlling crown delineation success, and as species evenness decreased, the probability of successful delineation increased. Evenness was likely important due to 1) its negative relationship to conifer fraction and 2) a relationship between evenness and increased canopy space filling efficiency.
Overall, my work suggests that the ability to delineate crowns is not strongly driven by methodological differences, but is instead driven by differences in functional group (conifer vs. hardwood), tree size, diversity, and how crowns are displayed in relation to each other. While LiDAR-based ITCD methods are well suited for conifer-dominated plots with distinct canopy structure, they remain less reliable in hardwood-dominated plots. I suggest that future work focus on integrating phenology and spectral characteristics with existing LiDAR approaches to better delineate hardwood-dominated stands.

Introduction
Individual tree crown delineation (ITCD) via remote sensing platforms offers a path forward to obtain wall-to-wall detailed tree inventory information over large areas. ITCD has been used to map species (Shi et al., 2018), biodiversity (Zhao et al., 2018), and carbon stocks (Coomes et al., 2017), as well as to quantify tree structural and spectral characteristics (Clark et al., 2005). While manually delineating crowns from high resolution imagery provides accurate measurements for small scale studies (Asner et al., 2002; Clark et al., 2005; Fang et al., 2018), effective automated methods are necessary if efforts are to be scaled to larger geographic regions. An ideal crown delineation method would be broadly applicable across stands varying in structural and compositional complexity. Given that many forests across the globe are under increasing pressure from climate change (Rustad et al., 2012), invasive pests (Crowley et al., 2016), and land-use change (Houghton, 1995), the need for reliable methods for measuring and mapping forests takes on additional urgency. Despite this need, broad-scale application of automated ITCD techniques remains difficult and its reliability is uncertain.
Considerable work has been done to develop and improve automated ITCD techniques (Ayrey et al., 2017; Dalponte and Coomes, 2016; Jing et al., 2012; Li et al., 2012; Lu et al., 2014; Silva et al., 2016a; Wan Mohd Jaafar et al., 2018; Zhen et al., 2015). Light Detection and Ranging (LiDAR) crown delineation methods tend to be favored over spectral methods because they are not impaired by shadow and illumination artifacts (Dalponte et al., 2015), and because they directly measure crown architecture (Zhen et al., 2016). However, reported accuracies of different LiDAR-based methods vary widely (Lu et al., 2014), and the success of their application is largely controlled by the structure of the forest of interest (Vauhkonen et al., 2012).
The structure of an individual crown and its position relative to neighboring crowns has a direct bearing on the success of ITCD. Crown architecture controls leaf display (Valladares and Niinemets, 2007), and trees must balance resource acquisition (e.g. light) with mechanical constraints (e.g. buckling under its own weight; Chave et al., 2009;Horn, 1971). Tree crown form is also plastic (Forrester et al., 2017;Muth and Bazzaz, 2003;Pretzsch, 2014;Valladares and Niinemets, 2007) and crown shape is a response to spatio-temporal variation in facilitative and competitive interactions with neighboring trees (Fichtner et al., 2017;Givnish, 2002), as well as a function of site history and disturbance (Forrester et al., 2017;Oliver and Stephens, 1977).
Despite the seemingly stochastic and complex nature of crown and stand structural development, there are also characteristic differences between needle-leaf evergreen (conifer) and deciduous broadleaf (hardwood) plant functional types that influence ITCD. Conifers and hardwoods exhibit differences in physiological traits and adaptation to resource acquisition, disturbance and stress (Augusto et al., 2014; Brodribb et al., 2012) that manifest in differences in crown shape and stand arrangement. LiDAR-based ITCD methods have been successfully applied in conifer-dominated systems (Silva et al., 2016b; Wang et al., 2016), while hardwood-dominated systems tend to be more challenging (Broadbent et al., 2008; Zhen et al., 2016). Discrepancies in accuracy of ITCD methods between conifer and hardwood systems are often attributed to the characteristic plagiotropic growth form (ellipsoidal or umbrella-shape) of hardwood crowns, which makes it difficult to identify tree tops, differentiate neighboring crowns, and group split canopies of an individual crown (e.g. Lu et al. 2014).
Despite the challenges, there is a need for ITCD in many regions dominated by hardwoods and mixed stands. The temperate forests of the northeastern United States are typically characterized by dense mixed species stands with closed canopies, where crowns often overlap and have irregular shape. Given the complexity of the forests and the dominance of hardwood trees, it remains unclear to what degree automated ITCD techniques can be employed in the region, or what the best ITCD approach would be. Here, I applied a series of automated LiDAR-based ITCD methods across plots ranging from conifer to hardwood dominated. I identified basic tree- and plot-level factors influencing the success of delineation techniques. Finally, I comment on how the ecology of conifers and hardwoods might best be exploited to delineate trees in temperate forests.

Site Description
This study was conducted in a Smithsonian Forest Global Earth Observatory plantations, and a 3-ha swamp (Orwig and Ellison, 2015). The age structure is dominated by 75-125 year old second growth forest (Plotkins et al., 2015). Dominant species include red oak (Quercus rubra), red maple (Acer rubrum), eastern hemlock, and white pine (Pinus strobus).
Other common species include Norway spruce (Picea abies), American beech (Fagus grandifolia) and birch (Betula spp.). Between 2010 and 2014 a census of the MegaPlot was conducted, in which all woody stems ≥ 1 cm were mapped, measured, and identified to species (http://harvardforest.fas.harvard.edu:8080/exist/apps/datasets/showData.html?id=hf253) (Orwig and Ellison, 2015). Heights of all stems were calculated using site-specific allometric equations. In 2018, I remotely established fifteen 20 m radius plots across the MegaPlot (supplemental Figure 11). Plots were selected to capture a full range of tree functional composition from conifer dominated to hardwood dominated. Each UAV image was aligned with the G-LiHT imagery using 20 control points and transformed using a first-order polynomial. The resulting georeferenced UAV images were in good visual agreement with the G-LiHT hyperspectral and LiDAR imagery and the field-measured stem locations, and tree crowns aligned with those visible in the G-LiHT imagery.

Crown Delineation
All tree crowns visually distinguishable within the fifteen plots were manually delineated by onscreen digitizing of the September 13th UAV image. This study excluded understory crowns not visible within UAV imagery. Manual delineation of individual tree crowns (MITC) was done with a stylus pen using the FreehandEditing plugin in QGIS. While crown digitization was performed on the September 13th image for consistency, multiple dates of imagery (September 13th, October 5th, October 12th, and November 4th) were used to help distinguish crowns and identify the species of each crown based on differences in shadow and phenology (Figure 2).

Figure 2:
Manual crown delineation was performed using high resolution UAV imagery. All delineations were done on the September 13th image (left panel), but other dates of imagery were used to help differentiate crowns growing in close proximity. The right panel (October 12th) gives an example of phenologic differences between species that can be leveraged to help separate crowns that might otherwise be clumped during manual interpretation.
MITC species label and associated stem attributes (DBH and allometrically derived tree height) were assigned manually during the digitization process from the ForestGEO stem data. In rare cases where a crown could conceivably belong to one of multiple stems from either the same species or stems from different species that could not be distinguished using phenology and textural cues, the crown was assigned to the stem with the higher allometrically derived tree height. Crown area and maximum CHM-derived crown height were calculated for each MITC.
Using MITC crowns, conifer fraction of each plot was calculated as the ratio of conifer crown area to hardwood crown area.
I tested five automated individual tree crown (AITC) delineation techniques (Table 1) available in the R (v. 3.5.1; R Core Team, 2018) package lidR (Roussel and Auty, 2019). Four routines are surface-based methods applied to a rasterized CHM, and the fifth is a 3D method applied to a LiDAR point cloud. Dalponte2016 (DALPONTE) is a surface-based seed and region growing method (Dalponte and Coomes, 2016). Silva2016 (SILVA) is a surface-based seed and Voronoi tessellation method (Silva et al., 2016a). Simple Watershed (SWS) is a surface-based watershed segmentation (Vincent and Soille, 1991). Marker-controlled Watershed (MCWS) is a watershed segmentation that relies on a priority seed map. Li2012 (LI) is a 3D region growing method applied to a point cloud (Li et al., 2012). All techniques were run using the lastrees function. Treetop priority seed points used in DALPONTE, SILVA, and MCWS were created with the tree_detection function using the lmf (local maximum filtering) algorithm (Popescu and Wynne 2013). SWS did not rely on a priority seed map, and LI has treetop detection built into the function. Although four of the five routines are surface-based methods applied to a CHM, by default all methods segment the point cloud. Final AITC polygons were generated using the tree_hulls function, which creates a 2D concave hull around the segmented point cloud. I chose not to smooth CHM data (e.g. Gaussian filtering) prior to crown delineation analyses. My preliminary results showed that smoothing either made no marked improvement in delineation success or, in certain cases, decreased the overall accuracy of the methods.

Table 1: Five automated LiDAR-based individual tree crown delineation routines were evaluated in this study. † Four routines are surface-based methods applied to rasterized canopy height models. ‡ The fifth routine is a 3D method applied to a point cloud. All routines were implemented in the R package lidR, developed by Roussel and Auty (2019).

Crown Delineation Routine              Reference
Dalponte2016 (DALPONTE) †              Dalponte and Coomes (2016)
Silva2016 (SILVA) †                    Silva et al. (2016a)
Simple Watershed (SWS) †               Vincent and Soille (1991)
Marker-controlled Watershed (MCWS) †   -
Li2012 (LI) ‡                          Li et al. (2012)

I then tuned each technique's input parameters to find 1) the best plot-tuned parameters (potentially unique parameters that maximized plot-level accuracy) and 2) the best generalized parameters (a single set of parameters that achieved the highest accuracy when evaluated across all 15 plots). Parameter tuning was done using a bootstrapping approach, where, during each iteration, input parameters were randomly selected within a predefined range. Following each delineation iteration, accuracy was assessed by comparing the generated AITC polygons to the reference MITC delineations. Automated delineations were paired to manual delineations so that any given MITC was labeled as either correctly or incorrectly delineated. A detection accuracy score (DA) was assigned to each iteration:

$$DA = \frac{n_{TP}}{N}$$

where $n_{TP}$ is the number of correctly delineated AITC and $N$ is the number of MITC (Yin and Wang, 2016). A given AITC was considered correctly delineated (true positive) if the intersection covered ≥ 50% of the area of both the AITC and the MITC (Figure 3; e.g. Lamar, McGraw, and Warner 2005; Leckie et al. 2004). Accuracies were recorded both as plot-level accuracies and as overall accuracy aggregated across all 15 plots. Each routine except LI was iterated 500 times. LI was only iterated 200 times because it was substantially slower than the surface-based methods and because the maximum accuracy achieved did not improve beyond the first 100 iterations. I retained the tuning iterations with the highest generalized parameter accuracy and the highest plot-tuned accuracy for each automated crown delineation method.
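The tuning loop described above can be sketched roughly as follows. This is a minimal illustration rather than the code used in the study: the function names, the `ws` parameter name, and the toy scoring function are hypothetical, and pooling counts across plots is only one plausible reading of "aggregated" accuracy.

```python
import random


def detection_accuracy(n_true_positive, n_reference):
    """DA = correctly delineated crowns / number of reference (manual) crowns."""
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    return n_true_positive / n_reference


def overall_accuracy(plot_counts):
    """Aggregate DA across plots by pooling counts (assumed aggregation rule).

    plot_counts: iterable of (n_true_positive, n_reference), one pair per plot.
    """
    counts = list(plot_counts)
    total_tp = sum(tp for tp, _ in counts)
    total_ref = sum(ref for _, ref in counts)
    return detection_accuracy(total_tp, total_ref)


def random_search(score_fn, param_ranges, n_iter, seed=0):
    """Draw each parameter uniformly within its range; keep the best-scoring set."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in param_ranges.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy usage: a stand-in score that peaks at a hypothetical window size of 3.
best, score = random_search(lambda p: -(p["ws"] - 3.0) ** 2,
                            {"ws": (0.0, 6.0)}, n_iter=200, seed=1)
```

In practice `score_fn` would run one delineation with the drawn parameters and return either the plot-level DA (for plot-tuned parameters) or the pooled DA across all plots (for generalized parameters).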
To further understand how each method performed at the crown level, I characterized the incorrect AITC delineations by type of error. Each crown was ultimately assigned to one of four categories based on its overlap with MITC (Figure 3):
A) Over-segmentation: The intersecting area between AITC and MITC is greater than or equal to 50% of the area of only AITC.
B) True Positive: The intersecting area between AITC and MITC is greater than or equal to 50% of the area of both AITC and MITC (as defined above).
C) Under-segmentation: The intersecting area between AITC and MITC is greater than or equal to 50% of the area of only MITC.
D) False Positive: The intersecting area between AITC and MITC is less than 50% of the area of both AITC and MITC.
Given that any MITC can only be linked to one AITC, in cases where multiple AITC crowns fell within a single MITC (as with over-segmentation), the MITC was assigned to the AITC that best overlapped with it, identified as the AITC crown that maximized the sum of IA and IM, where IA is the ratio of AITC:MITC intersection area to AITC area, and IM is the ratio of AITC:MITC intersection area to MITC area.

Figure 3: Automated and manual crown delineations were compared (... shown with green fill) and assigned to one of four categories based on overlapping area: a) Over-segmentation: The intersecting area between AITC and MITC is greater than or equal to 50% of the area of only AITC. b) True Positive: The intersecting area between AITC and MITC is greater than or equal to 50% of the area of both AITC and MITC. c) Under-segmentation: The intersecting area between AITC and MITC is greater than or equal to 50% of the area of only MITC. d) False Positive: The intersecting area between AITC and MITC is less than 50% of the area of both AITC and MITC.
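The four overlap categories and the IA + IM matching rule can be expressed compactly. The sketch below assumes the intersection and polygon areas have already been computed (for example in a GIS); the function names and the tuple layout are illustrative choices, not part of the original workflow.

```python
def classify_overlap(intersection_area, aitc_area, mitc_area, threshold=0.5):
    """Assign one AITC/MITC pair to an overlap category.

    IA = intersection / AITC area, IM = intersection / MITC area.
    """
    ia = intersection_area / aitc_area
    im = intersection_area / mitc_area
    if ia >= threshold and im >= threshold:
        return "true_positive"
    if ia >= threshold:
        return "over_segmentation"   # only the AITC is mostly covered
    if im >= threshold:
        return "under_segmentation"  # only the MITC is mostly covered
    return "false_positive"


def best_match(mitc_area, candidates):
    """Pick the AITC that maximizes IA + IM for a single MITC.

    candidates: list of (aitc_id, intersection_area, aitc_area) tuples.
    """
    def overlap_score(candidate):
        _, intersection_area, aitc_area = candidate
        return intersection_area / aitc_area + intersection_area / mitc_area

    return max(candidates, key=overlap_score)[0]


# Toy usage: two automated crowns falling inside one 20 m^2 manual crown.
matched = best_match(20.0, [("a", 5.0, 6.0), ("b", 8.0, 9.0)])
category = classify_overlap(8.0, 9.0, 20.0)
```

Here crown "b" is matched because its IA + IM is larger, and the pair is labeled as over-segmentation because only the automated crown is at least 50% covered by the intersection.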

Statistical Analysis
To understand the factors that influenced automated crown delineation I calculated multiple metrics used to describe tree-level attributes (DBH, crown height, and crown area), and plot-level vertical and horizontal structural and compositional complexity (canopy complexity, uniformity of crown spacing, relative density, trees per plot, and species diversity). Plot-level metrics only included stem attributes associated with MITC data.
Plot canopy complexity was estimated using the Rumple Index (Kane et al., 2008), a ratio of canopy surface area to projected ground area. Uniformity of crown spacing, an aggregation index (AGI) developed by Clark and Evans (1954), was calculated from MITC centroids as described by Pommerening (2002). Relative stem density was calculated using a mixed-species relative density equation (Ducey and Knapp, 2010). Trees per plot (TPP) was calculated as the number of MITC per plot. Species diversity was calculated using Shannon's Diversity Index (H), Pielou's Evenness Index (J), and species richness (Heip et al., 1998). All predictor variables were standardized to have a mean of zero and a standard deviation of one by subtracting the mean and dividing by the standard deviation (McCune and Grace, 2002).
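For readers unfamiliar with these indices, the sketch below shows the standard textbook forms of H, J, and the Clark and Evans aggregation index. It assumes diversity is computed from counts of crowns per species and that no edge correction is applied to the aggregation index; the study may have differed on both points, and the function names are illustrative.

```python
import math
from collections import Counter


def shannon_diversity(species_labels):
    """Shannon's H = -sum(p_i * ln p_i) over species proportions p_i."""
    counts = Counter(species_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def pielou_evenness(species_labels):
    """Pielou's J = H / ln(S), where S is species richness."""
    richness = len(set(species_labels))
    if richness < 2:
        return float("nan")  # J is undefined for a single species
    return shannon_diversity(species_labels) / math.log(richness)


def clark_evans_index(points, plot_area):
    """Observed mean nearest-neighbor distance over its expectation under
    complete spatial randomness (no edge correction in this sketch)."""
    n = len(points)
    nearest = [min(math.dist(p, q) for j, q in enumerate(points) if j != i)
               for i, p in enumerate(points)]
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / plot_area)
    return observed / expected


# Toy usage: four species in equal abundance, and four regularly spaced crowns.
labels = ["red oak", "red maple", "white pine", "eastern hemlock"]
evenness = pielou_evenness(labels)
agi = clark_evans_index([(0, 0), (1, 0), (0, 1), (1, 1)], plot_area=4.0)
```

Values of the aggregation index above 1 indicate regular spacing and values below 1 indicate clustering; J approaches 1 when species are equally abundant.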

To identify important plot-level variables, I performed univariate linear regressions between all plot-level metrics and plot-tuned accuracy (n = 15) for all five crown delineation routines, and I retained any variable found to be significant (p < 0.05) in at least one regression. I then built a global multiple linear regression model including all significant variables from the univariate analyses. Multicollinearity was evaluated using the variance inflation factor (VIF), and I removed highly inter-correlated variables until the VIF of all variables was <10 (Hair et al., 1995). The best model for plot-level performance was chosen using the Akaike Information Criterion corrected for small sample size (AICc; Burnham and Anderson, 2002).
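The two screening quantities used here have simple closed forms, sketched below. The least-squares form of AIC and the convention for counting parameters (k) are assumptions for illustration; statistical packages differ on whether the residual variance is counted in k.

```python
import math


def variance_inflation_factor(r_squared):
    """VIF = 1 / (1 - R^2), where R^2 comes from regressing one predictor
    on all of the other predictors."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit, up to an additive constant:
    n * ln(RSS / n) + 2k."""
    return n * math.log(rss / n) + 2 * k


def aicc(aic, n, k):
    """Small-sample correction: AICc = AIC + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic + (2 * k * (k + 1)) / (n - k - 1)


# Toy usage with n = 15 plots and a hypothetical model with k = 3 parameters.
penalty = aicc(0.0, n=15, k=3)  # correction term alone, 24 / 11
```

With only 15 plots the correction term is not negligible, which is why the corrected criterion is preferred over plain AIC at this sample size. A VIF threshold of 10 corresponds to a predictor whose R² against the other predictors is 0.9.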
Finally, I built mixed-effect logistic regressions to understand which tree- and plot-level factors influenced the odds that each MITC would be correctly delineated, modeling the log-odds as a linear function of covariates (Oberle et al., 2018). Logit models were built in the R package lme4 (Bates et al., 2015). Each global model included tree-level variables and the plot-level variables found to be significant during the univariate analyses described above. I controlled for plot-level variability by including plot as a random effect in each model. Model selection was performed by backward elimination from the global model, and the final model was chosen by minimum AIC (Burnham and Anderson, 2002). I took the number of times a variable was included across the five models as an indication of the importance of that variable for crown delineation.
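To make the model structure concrete, the sketch below converts a linear predictor with a plot-level random intercept into a probability of correct delineation. It only illustrates the functional form; the coefficient values and variable names are hypothetical and are not fitted results from this study.

```python
import math


def inverse_logit(log_odds):
    """Map log-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def delineation_probability(intercept, coefficients, covariates, plot_effect=0.0):
    """Probability that a crown is correctly delineated.

    log-odds = intercept + plot random intercept + sum(beta_j * x_j),
    with covariates standardized to mean 0 and standard deviation 1.
    """
    log_odds = intercept + plot_effect + sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return inverse_logit(log_odds)


# Hypothetical coefficients, for illustration only.
example = delineation_probability(
    intercept=0.0,
    coefficients={"dbh": 0.8, "evenness": -0.5},
    covariates={"dbh": 1.0, "evenness": 1.0},
    plot_effect=0.0,
)
```

A positive coefficient raises the odds of correct delineation as the covariate increases, and the plot-level intercept shifts all crowns in a plot up or down together.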
Model accuracies were evaluated using 10-fold cross validation, where the developed logistic relationships were each trained on 90% of the data and tested on the remaining 10%. Training and testing were performed on all 10 folds of data and the results were averaged to give an estimate of each model's accuracy.
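The cross-validation procedure can be sketched generically as below. The `fit` and `score` callables are placeholders for model training and accuracy evaluation, and random assignment of individual crowns to folds is an assumption; the text does not state how folds were formed.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, fit, score, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the scores.

    fit(train_indices) -> model; score(model, test_indices) -> float.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_indices in enumerate(folds):
        train_indices = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit(train_indices)
        scores.append(score(model, test_indices))
    return sum(scores) / k


# With 650 crowns and k = 10, each fold holds 65 crowns (10% of the data).
folds = k_fold_indices(650, k=10)
```

Each crown appears in exactly one test fold, so the averaged score reflects predictions made only on data withheld from training.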

Manual Crown and Plot Characteristics
I manually delineated 650 tree crowns from 14 unique species. Of those, 379 were conifer crowns, and 271 were hardwood crowns. The ranges in height, DBH, and crown area were comparable between conifers and hardwoods. On average, conifers were taller and had larger DBH (Figure 4), while median conifer crown area was 27% smaller than that of hardwood crowns.

Differences in methods and influence of parameter tuning
The influence of generalized parameter tuning compared to default parameters varied by method (Table 2). LI improved by 17% to achieve the highest generalized parameter accuracy (55%). SWS was particularly sensitive to parameterization, and overall accuracy improved from 8% to 49% compared to default parameters. In contrast, MCWS, which differs from SWS only in having a priority seed point, was relatively robust to parameter tuning. MCWS achieved 49% overall accuracy with default parameters and only improved by 6% following tuning. SILVA and DALPONTE were similarly robust, and generalized tuning of parameters only marginally improved accuracy (+1–2%). While further plot-tuning of method parameters only marginally improved overall accuracy scores (+2–6%), I chose to continue the analyses using plot-tuned results because plot-level accuracy (supplemental Table 5) improved by as much as 36% (LI) and because I was interested in understanding the factors that influenced the highest quality delineations.

Overall and plot-level accuracy
Following plot-tuning, overall accuracy and plot-level accuracy did not vary substantially across delineation methods. Overall accuracy ranged from 51% for SWS to 59% for LI. Though LI was marginally more accurate (+4%) than the method with the second highest overall accuracy (MCWS: 55%), this came at a substantial increase in processing time and complexity of input parameters (and LI necessarily requires parameter tuning to achieve high accuracy).
Plot-level accuracy ranged from 27% (DALPONTE and SWS) to 90% (MCWS), and the difference between the most and least accurately delineated plots was >40% for all methods.
Plot-level accuracy varied similarly widely for all methods (supplemental Figure 10), and was significantly related to conifer fraction (p < 0.05) for all methods.

Differences in accuracy across species
All methods more accurately delineated conifer crowns (mean 64%) than hardwood crowns (mean 42%). Each method had trade-offs in accuracy at the species level, and no single method stood out as having the highest accuracy across all species (Figure 5). For example, SILVA delineated red pine especially well (81%), but had consistently low hardwood accuracy scores. In contrast, SWS, which had the lowest red pine accuracy (53%), excelled at delineating red oak in comparison to other methods (+9%).
Accuracy for other hardwood species (birch spp., black oak, white ash, and black cherry) ranged from a high of 40% (LI) to a low of 15% (DALPONTE).

Figure 5: All automated crown delineation methods showed similar species level accuracy. Generally, conifer species (eastern hemlock, red pine, white pine and spruce) were more accurately delineated than hardwood species.

Linear regressions
Five plot-level variables (J, TPP, rumple, H, and AGI) were found to be significant (p < 0.05) in at least one univariate regression (Figure 6). The relationship between accuracy and J was significant (p < 0.05) for all models (supplemental material: Table 4).

Figure 6:
Linear regression analysis for plot-level variables and accuracy of one of the automated crown delineation methods (DALPONTE). Points are colored to show the fraction of conifer crown area per plot (conifer fraction). The relationship between accuracy and evenness (J) was significant (p < 0.05) across all methods. However, the relationship between accuracy and aggregation index (AGI) was only significant for DALPONTE and SILVA, and the relationship between accuracy and rumple index was only significant for DALPONTE, SILVA and LI.

Logistic Regressions
Global logit models consisted of tree-level variables (DBH, height, and crown area) and the plot-level variables (rumple, J and AGI) identified in the linear regression analyses. Results of the final logit models are shown in Table 3. Cross validation model accuracy ranged from 61% (MCWS) to 70% (SWS), suggesting that while I captured the most impactful variables in predicting crown delineation, there may be additional factors unaccounted for in my analysis.
There was no single variable that was included in all five models. However, all but one model (SWS) consisted of at least one tree-level variable related to tree size and one plot-level variable related to tree arrangement.

Discussion
Automated crown delineation remains difficult to apply in closed canopy mixed-species forests. Despite parameter tuning, none of the methods produced high accuracy across all plots, and there was relatively little difference between crown delineation methods. I found that all methods delineate conifer species better than hardwood species, and that accuracy of each method varied similarly across plots. Thus, it is evident that the ability to delineate crowns is not strongly driven by methodological differences, but instead driven by differences in conifer and hardwood functional groups. Conifers and hardwoods have developed traits that distinguish their ability to compete for resources and respond to disturbance and competition. In turn these traits influence tree height, crown architecture (crown spreading and leaf-display), and how crowns interact with neighboring crowns.

Tree Architecture
Tree Size
I found that taller trees and larger diameter trees were more likely to be correctly delineated. This is in part because large trees often hold dominant positions in the canopy and tend to have more symmetrical crown shape (Muth and Bazzaz, 2003). Yet, this is also because conifers in the canopy tended to be taller and have larger diameters (Figure 4). Conifer species identified on the plots have lower average wood density (specific gravity) than the hardwood species (Ducey and Knapp, 2010), which is energetically efficient for height growth (Anten and Schieving, 2010; Horn, 1971). In higher diameter size classes conifers diverge from hardwoods, growing taller (Ducey, 2012).

Figure 7:
G-LiHT LiDAR point cloud comparison highlighting the differences in structure between a hardwood dominated stand (A) and a conifer dominated stand (B). Warmer colors represent higher points in the canopy. The conifer dominated stand exhibits higher canopy rumple, and uniformity of crown shape. The conical, less-plastic shape of conifer crowns may also reduce canopy space filling efficiency.
Conifers, especially white pine, are larger (diameter and height) because of site history and growth strategy. Much of the northeastern United States landscape has been shaped by historical land use (Foster et al., 1998; Thompson et al., 2013). White pine are successful colonizers of disturbed sites, and many of the large white pine are old-field pines that invaded agricultural and pastoral fields following abandonment in the mid-1800s (Abrams, 2001). Low-density wood and relatively high photosynthetic rates (Anten and Schieving, 2010; Brodribb et al., 2012) allow pine to achieve rapid vertical growth, and they continue to avoid direct competition by occupying a higher canopy stratum than hardwoods. On a canopy height model, emergent white pine appear as hotspots (supplemental Figure 13) because they often stand five or more meters above the continuous canopy; thus, they are easily detected and delineated by automated crown delineation methods.
Crown Spread. I found smaller crowns were more likely to be successfully delineated and, similar to height, this is likely related to differences between conifers and hardwoods. Many mid- and shade-tolerant hardwood species have weak apical control that results in plagiotropic growth forms (Pretzsch and Rais, 2016). Weak apical control allows multiple stems to compete for a dominant terminal position, the result of which can be a broad and flat crown, often with forked trunks and multiple differentiated sections within a single crown (i.e. crown splitting).
Conifer crowns tend to spread less than hardwood crowns, though it is possible to find white pine or hemlock that are comparable in spread to hardwood crowns. However, conifers maintain a more rigid, apically controlled growth form, and are less likely to exhibit crown plasticity (Strigul et al., 2008; Vincent and Harja, 2008). This results in a singularly defined orthotropic bole and the characteristically conical crown shape, and it is far rarer to find conifers with forked trunks and split crowns. The ability to spread branches laterally is associated with wood density. Wood density is correlated with structural properties, including resistance to splitting, rupture stress, dynamic breakage, and increased elasticity (Chave et al., 2009). While low-density wood is a lower carbon-cost approach to attain vertical expansion, hardwood species with denser wood can expand lateral branching without compromising structural integrity (Anten and Schieving, 2010; Horn, 1971). This is in agreement with crown radius-DBH allometric equations developed by  at the Harvard Forest. They found the crown radius slope to be steeper for hardwoods than conifers, and that this relationship was related to wood specific gravity.
Red oak, in particular, often have substantial crown spread and split crowns. This type of architecture presents two major challenges for automated tree crown delineation: 1) it is difficult to define a singular local maximum, and 2) crowns either interdigitate with neighboring crowns, resulting in under-segmentation, or crowns split, resulting in over-segmentation. I found all methods most often over-segmented red oak (Figure 8). My results agree with other studies that found hardwood canopies are often over-segmented (Zhen et al., 2016).
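The first of these challenges can be illustrated with a minimal sketch of local-maximum treetop detection on a toy canopy height model. This is not one of the five methods tested here: the grid values, the 3 x 3 search window, the minimum-height threshold, and the function name `local_maxima` are all assumptions chosen only for illustration. A conical crown yields a single peak, whereas a broad crown with two sub-crowns yields two candidate treetops, which is the pattern that tends to lead to over-segmentation.

```python
def local_maxima(chm, min_height=2.0):
    """Return (row, col) cells strictly higher than all neighbors in a 3 x 3 window.

    `chm` is a list of rows of canopy heights in meters (hypothetical toy grid).
    Cells below `min_height` are ignored as ground or understory.
    """
    rows, cols = len(chm), len(chm[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            # Collect the up-to-8 neighbors, clipping the window at the grid edge.
            neighbors = [
                chm[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(h > n for n in neighbors):
                peaks.append((r, c))
    return peaks


# Hypothetical heights (m): one conical crown with a single, well-defined apex.
conifer_chm = [
    [0, 0, 0, 0, 0],
    [0, 18, 20, 18, 0],
    [0, 20, 26, 20, 0],
    [0, 18, 20, 18, 0],
    [0, 0, 0, 0, 0],
]

# Hypothetical heights (m): one broad crown split into two sub-crowns.
split_crown_chm = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 19, 20, 18, 20, 19, 0],
    [0, 20, 22, 19, 22.5, 20, 0],
    [0, 19, 20, 18, 20, 19, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

conifer_peaks = local_maxima(conifer_chm)    # a single treetop candidate
split_peaks = local_maxima(split_crown_chm)  # two candidates for one tree
```

In practice, smoothing the height model or widening the search window can suppress the second peak, but at the cost of merging neighboring crowns that are genuinely separate trees.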

Mechanical interaction
Mechanical interaction between neighboring crowns is another major dynamic controlling lateral branch expansion, perhaps even more than resource competition (Hajek et al., 2015). Crown shyness (gaps that form between adjacent crowns, often of the same species) can result from mechanical bud abrasion and branch damage during crown collisions (Putz et al., 1984). While mechanical interactions occur between all adjacent crowns in closed-canopy stands, canopy gap persistence (i.e. crown shyness) is controlled by branch fragility and rates of regrowth following lateral branch damage (Hajek et al., 2015).
Crown shyness is especially visible in red pine dominated plots (supplemental Figure 14), which were placed in an even-aged remnant pine plantation (Rainey et al., 1999). Crown shyness is a common occurrence in even-aged conifer dominated stands (Goudie et al., 2009), and shyness likely contributed to not only the high accuracy in these plots (as high as 80%), but also the fidelity of the delineations, because gaps between adjacent crowns create defined borders for delineation (Figure 8). In comparison to hardwood species with strong, dense branches (e.g. red oak), red pine is more susceptible to collision damage. High height:diameter ratios coupled with low wood density make the crowns of red pine susceptible to wind damage (Wonn and O'Hara, 2001) through increased crown mobility and resulting high-impact crown collisions (Rudnicki et al., 2001).

A traits perspective
Major differences in tree architecture between conifers and hardwoods stem from differences in underlying traits. While hardwoods in direct competition often outcompete conifers in nutrient-rich environments (Oliver and Larson, 1996), conifers have evolved different trait adaptations to disturbance and tolerance to stress (Brodribb et al., 2012) that allow them to persist (and sometimes outcompete hardwoods) in temperate forests. Within the 'fast-slow' plant economic spectrum proposed by Reich (2014), many of the traits exhibited by conifers would be considered slow in comparison to hardwood traits.
Conifers, many of which have evolved in resource-poor conditions, often invest in long-lasting low-nitrogen (N) foliage (Gower et al., 1995). Convergent leaf- and canopy-structural properties (conical crown shape, clumped foliage) promote light scattering and more even, diffuse light conditions throughout the canopy, which in turn increase the radiation use efficiency of low foliar N species (Cohen and Pastor, 1996; Gower et al., 1995). Further, Ollinger (2011) pointed out that plants grown (or adapted to grow) in resource-poor conditions allocate fewer resources to wood vs foliage, constraining crown spread.
In contrast, hardwoods have developed a fast strategy where they invest in costly high-N deciduous foliage which turns over annually. To pay for the high carbon-cost investment, hardwoods must maximize direct light interception. Mid- and shade-tolerant hardwoods (red oak, red maple) achieve this by spreading their crowns to maximize foliage display on a more even plane.
Within Reich's (2014) plant economic spectrum, fast-trait species should have lower density wood optimized to transport water, and one might also expect fast-trait species to reach taller heights to optimize high-N leaf display. However, conifers and hardwoods have different wood anatomy (e.g. tracheids vs. xylem vessels for water transport), which makes direct comparison difficult (Brodribb et al., 2012). Further, while less dense wood of conifers allows comparable (or greater) height growth, it does so at a lower carbon-cost, which allows more carbon investment elsewhere (e.g. belowground in nutrient-poor environments; Gower et al., 1995).
There is also considerable variation in traits within each functional group (supplemental Figure 15). While in comparison to hardwoods white pine may be considered slow, within conifers, white pine is undoubtedly fast, with higher foliar nitrogen, shorter leaf life-span and low-density wood. At HF, white pine can diverge from the typical conical shape seen in other conifers, displaying spreading and flat-topped crowns. If it were not for other characteristics (e.g. occupying a higher canopy stratum) it may have been more difficult to delineate pine. Within hardwoods, early successional species (aspen, birch) have comparably high foliar N and low wood density. Where oaks and maples spread, these species invest in rapid vertical growth and very modest crown spread, and this combination of traits may result in easier ITCD, though this study did not permit me to investigate this. Thus, it appears that while the 'fast-slow' traits perspective provides an interesting lens to view crown architecture as it relates to crown delineation, there is considerable variation in traits, and perhaps even inverse relationships, within functional groups.
Species Evenness. I found that species evenness was the most important plot-level variable controlling crown delineation success. As species evenness decreased, the odds of successful delineation increased. Evenness was likely important because of 1) its negative relation to conifer fraction, and 2) a relationship between evenness and canopy space filling efficiency.
There was a strong relationship between species evenness and conifer fraction (supplemental Figure 9); the least even plots had the highest conifer fraction while the most even plots tended to have the lowest conifer fraction. It is important to note that two of the low evenness conifer plots were artificial in the sense that they are located in a remnant red pine plantation, though red pine can grow naturally in monoculture. However, the other low evenness conifer plot was in a natural mature hemlock stand, a common occurrence in temperate forests (Small et al., 2005). It is not uncommon for conifer stands to have low evenness because of generally lower diversity (Augusto et al., 2014) of conifer species, and because needles of conifer species have a high C:N ratio that can alter soil fertility conditions and deter hardwood establishment and growth (Brodribb et al., 2012).
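To make the behavior of the evenness metric concrete, the sketch below computes Shannon's diversity index H' and Pielou's evenness J = H' / ln(S), where S is the number of species present. The stem counts are hypothetical and are not plot data from this study; the function names are illustrative, and using stem counts as the abundance measure is an assumption, since the abundance measure is not specified in this section.

```python
import math


def shannon_index(counts):
    """Shannon's diversity index H' from per-species abundances."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)


def pielou_evenness(counts):
    """Pielou's evenness J = H' / ln(S), with S the number of species present."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        # J is undefined for a single species; returning 0.0 is a convention
        # adopted for this sketch only.
        return 0.0
    return shannon_index(counts) / math.log(richness)


# Hypothetical stem counts per species (not data from this study).
plantation_like = [45, 3, 2]     # one species dominates: low evenness
mixed_hardwood = [12, 11, 10, 9, 8]  # species similarly abundant: high evenness

j_plantation = pielou_evenness(plantation_like)
j_mixed = pielou_evenness(mixed_hardwood)
```

A plot dominated by one species scores well below a plot where species are similarly abundant, which is why the plantation and hemlock plots fall at the low end of the evenness gradient.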
Despite the evident influence of conifer fraction, the evenness-accuracy relationship may also be reflective of increased efficiency of canopy space filling (i.e. crown packing) in higher diversity plots. Recent work has shown that crown packing increases with species diversity (Jucker et al., 2015; Pretzsch, 2014), and that neighborhood species diversity also has a positive impact on individual tree productivity (Fichtner et al., 2017, 2018). In low diversity stands, trees from the same species compete similarly for growing space (sensu Oliver & Larson, 1996), while in higher diversity stands, niche partitioning and complementarity of crown architecture promote partitioning of resources (Morin et al., 2011; Sapijanskas et al., 2012), allowing more efficient and complete use of available canopy space (Pretzsch and Schütze, 2016; Williams et al., 2017). As plot diversity increases, crown packing increases, and it becomes increasingly difficult to differentiate neighboring crowns (Figure 7).
To further investigate the potential relationship between species evenness and crown packing, I calculated plot NDVI from the G-LiHT hyperspectral data as a proxy estimate of leaf area index (LAI) and foliar density (Qiao et al., 2019), assuming increased crown packing would be related to increased LAI. I found evenness is strongly related to NDVI (p < 0.001; R² = 0.81).
However, because NDVI is also related to conifer fraction (Waring et al., 1995), I performed a partial correlation test. I found that after accounting for conifer fraction, NDVI was still positively correlated (r = 0.58) with species evenness, lending support to the idea that the evenness-accuracy relationship is both a result of conifer fraction and increased crown packing in higher diversity plots.
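A first-order partial correlation of this kind can be obtained from the three pairwise Pearson correlations, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz²)(1 - r_yz²)). The sketch below is a minimal illustration using made-up plot values, not the study data; the variable names are assumptions, and the specific test implementation used in the analysis is not stated here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical plot-level values (not data from this study).
conifer_fraction = [0.9, 0.8, 0.7, 0.5, 0.4, 0.3, 0.2, 0.1]
evenness = [0.35, 0.50, 0.45, 0.70, 0.65, 0.85, 0.80, 0.90]
ndvi = [0.70, 0.76, 0.74, 0.82, 0.80, 0.88, 0.85, 0.89]

r_simple = pearson(ndvi, evenness)
r_partial = partial_correlation(ndvi, evenness, conifer_fraction)
```

Comparing `r_simple` with `r_partial` shows how much of the NDVI-evenness association remains once the shared dependence on conifer fraction is removed.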
A silver lining: where do these methods work?
I found automated LiDAR ITCD methods show great promise for delineation of large trees. Despite lower accuracy for smaller size trees, these results are encouraging given the important role large trees play in terrestrial ecosystems (Freckleton and Watkinson, 2001), especially in terms of carbon accumulation (Stephenson et al., 2014). I was able to delineate 62-70% of all trees ≥ 40 cm DBH, which is promising for the prospect of tree-centric carbon mapping (Coomes et al., 2017;Dalponte and Coomes, 2016).
I also found these methods to perform especially well in conifer dominated stands. In particular, the current ability to delineate mature eastern hemlock has implications for current research and conservation interests in monitoring and mapping hemlock woolly adelgid (HWA) infestations (Orwig et al., 2012). Given the impact HWA has on the structure and composition of infested forests (Small et al., 2005), the scientific community should not hesitate to deploy existing crown delineation methods to aid in measuring and mapping HWA impacts.
Much of the northeast United States is still aggrading second growth forest (Thompson et al., 2013). However, while my plots cover a range of structure and composition, they are undoubtedly still just a sample of the different forest types found across the northeast. LiDAR crown delineation methods are likely to show varying degrees of accuracy based on additional factors influencing structure, such as stage of forest succession (van Ewijk et al., 2013).
Relatively young stands in the stem-exclusion stage (sensu Oliver & Larson, 1996) are likely to be especially difficult to delineate because of high stem density and intense competition, while mature and old-growth stands may show opposite patterns. Given that stand structural complexity often increases with stand age (Bradford and Kastendick, 2010), with an increased number of large trees (Lorimer and Frelich, 1998) and canopy surface complexity (Ogunjemiyo et al., 2005), I would expect mature and old-growth stands to be delineated with higher accuracy.

Moving forward
LiDAR-based crown delineation methods have garnered substantial interest in recent years because of the ability to directly measure structural characteristics of tree crowns (Lindberg and Holmgren, 2017). However, these methods still struggle to delineate hardwood canopies. What many (deciduous) hardwood crowns lack in architectural distinction (which many conifer crowns have, and which lends itself to LiDAR-based delineation), they make up for in phenology and spectral distinction.
Indeed, much of the information I relied upon to manually delineate tree crowns (subtle differences in hue and texture) is lost in a LiDAR CHM. Even more information may be available in hyperspectral or multi-temporal RGB imagery (supplemental Figure 12). Many studies have shown great success in spectrally distinguishing canopy species using hyperspectral (e.g. Shi et al., 2018) and multi-temporal imagery (Fang et al., 2018), while fewer studies have made use of this wealth of information to delineate mixed and hardwood-dominated forests (Maschler et al., 2018; Yang et al., 2017).
Undoubtedly, there has been work to use high resolution imagery for crown delineation, and it was the focus during the genesis of this research topic (Lamar et al., 2005; Leckie et al., 2004). However, many of these studies relied on panchromatic or single-band imagery (Ke and Quackenbush, 2011). Despite the limitations of spectral methods (Dalponte et al., 2015), integration of spectral characteristics into crown delineation methods would likely improve the ability to differentiate neighboring crowns that would otherwise be under-segmented, or to group crowns that would otherwise be over-segmented. Future work should focus on developing spectral or integrated LiDAR-spectral delineation methods. The widespread availability of spectral platforms, including high-resolution spaceborne platforms, adds incentive to develop effective methods because of the potential to apply them broadly.

Conclusion
The ability to automatically delineate individual tree crowns in all types of forests would be a major step forward for remote sensing-based ecology. I found that crown delineation remains difficult in closed-canopy mixed species forests of the northeastern United States. While LiDAR-based methods work well in conifer dominated plots, they are somewhat less effective in hardwood dominated plots, which may limit the applicability of these methods over broad spatial scales. Overall, discrepancies in accuracy appear to be driven by differences in underlying traits controlling tree architecture and how trees interact with each other in close proximity. LiDAR methods work especially well in conifer dominated stands with distinct crown architecture.
While hardwoods often lack the same structural distinction, they have unique phenology that may be exploited to improve delineation techniques. My work points towards a need to develop crown delineation techniques that integrate both structural and spectral characteristics to effectively delineate mixed species stands.

Figure 9:
Relationships between the fraction of conifer crown area per plot (conifer fraction) and Shannon's Diversity Index (A), Rumple Index (B), Pielou's Evenness Index (C), trees per plot (D), and Aggregation Index (E).

Figure 10:
Following parameter tuning, plot-level accuracy varied similarly among methods across plots, indicating that accuracy is largely controlled by the structure and composition of the plots rather than methodological differences.

Figure 12:
UAV imagery collected over the ForestGEO MegaPlot on September 13th (A), October 12th (B), October 22nd (C) and November 4th (D). Imagery from all of these dates was used during manual crown delineation. The images highlight differences in phenology that may be useful for future crown delineation work.

Figure 13:
Emergent white pine crowns stand out as hotspots on a canopy height model. Low density wood allows white pine to grow taller than all other species in the Harvard Forest. They can often stand five or more meters above the continuous canopy.

Figure 14:
Red pine often exhibit crown shyness when grown in monoculture. Panel A shows crown shyness from below the canopy, while Panel B shows it from above the canopy.

Figure 15:
Conifer and hardwood functional groups show distinct differences in foliar nitrogen and wood density (specific gravity) that influence overall tree architecture and how they interact with neighboring crowns. There is also considerable variation of traits within functional groups. Average foliar %N values taken from Northeastern Ecosystem Research Cooperative (2010). Average specific gravity values taken from Ducey and Knapp (2010).