Full-waveform Airborne Laser Scanning in Vegetation Studies—a Review of Point Cloud and Waveform Features for Tree Species Classification

In recent years, small-footprint full-waveform airborne laser scanning has become readily available and established for vegetation studies in the fields of forestry, agriculture and urban studies. Independent of the field of application and the derived final product, each study uses features to classify a target object and to assess its characteristics (e.g., tree species). These laser scanning features describe an observable characteristic of the returned laser signal (e.g., signal amplitude) or a quantity of an object (e.g., height-width ratio of the tree crown). In particular, studies dealing with tree species classification apply a variety of such features as input. However, an extensive overview, categorization and comparison of features from full-waveform airborne laser scanning and how they relate to specific tree species are still missing. This review identifies frequently used full-waveform airborne laser scanning-based point cloud and waveform features for tree species classification and compares the applied features and their characteristics for specific tree species detection. Furthermore, limiting and influencing factors on feature characteristics and tree classification are discussed with respect to vegetation structure, data acquisition and processing.


Introduction
In recent years, small-footprint full-waveform (FWF) airborne laser scanning (ALS) has become readily available and established for vegetation studies. This kind of data consists of the complete backscattered waveform information and facilitates better sampling of the lower parts of tree crowns compared to data from discrete ALS. Several geometric as well as radiometric features can be derived from these datasets. Geometric features are related to the XYZ coordinates of the point cloud, and radiometric features refer to specific echo parameters that are extracted from the received waveform [1]. Such enriched point clouds offer an added value to features from discrete ALS for vegetation and forest mapping and for individual tree identification and classification [2,3]. Since 2004, FWF ALS systems have been increasingly used for object detection and classification [4,5], and for the derivation of vegetation characteristics in forestry [6][7][8][9][10][11], agriculture [12], and in urban areas [13]. Research efforts focusing on vegetation are expanding due to an increasing need to quantify vegetation characteristics and to model environmental dynamics [14]. In particular, tree species information and the accurate identification of single trees are of great significance for sustainable forest management, ecosystem investigations and climate-vegetation relationships on the one hand, and for tree-related scientific research on the other [15].
The majority of vegetation studies with FWF ALS data input are conducted in the context of forestry research, aiming at a more accurate estimation of tree and forest stand parameters.
Apart from forestry applications, 3D mapping of vegetation has gained increasing interest in urban data management as well as in ecological studies for monitoring and inventory issues. The richer spatial and radiometric information about the volumetric structure of trees from FWF data and the involvement of taxonomic information can assist in the assessment of biodiversity [2,28,29], animal ecology [30] or in the detection of invasive species [31]. Information on different tree species and their characteristics is also needed in urban environments [32], where both the presence and spatial variation of vegetation have a direct impact on radiation penetration and solar energy flux [33], evapotranspiration, microclimate and air circulation [34], and thus influence temperatures and air quality [2,35]. Many practical applications such as noise mitigation, reduction of air pollution, energy management and management of recreation areas rely on detailed up-to-date data sources. However, capturing detailed and classified data is challenging in urban environments, as these are associated with a high structural complexity involving different object types and a variety of geometric shapes. In contrast to closed forest stands, urban trees are characterized by large species diversity within small areas and a complex (typically anthropogenic) shape [13]. Here, FWF data can support the derivation of urban tree structure and species information [13,36].
Common to each study and application is the use of laser scanning features to capture the target object and its characteristics (e.g., tree species). Laser scanning features describe an observable characteristic of the returned laser signal (e.g., signal amplitude) or a quantity of an object (e.g., height-width ratio of the tree crown). Such features are available from the waveform and the point cloud directly or need to be derived by further processing (e.g., the backscatter cross-section by radiometric calibration [1]). Several studies and review papers on the use of FWF ALS data have been published, and several of them deal with vegetation investigation and tree species classification, discussing the derivation and use of features for this purpose [37][38][39][40][41]. However, an extensive overview, categorization and comparison of FWF ALS-based point cloud and waveform features for tree species classification, and of how these features relate to specific tree species, are still missing.
This review aims to fill this gap by identifying and evaluating frequently used and indicative point cloud and waveform features for tree species classification, which are derived from decomposed and partly radiometrically calibrated FWF ALS data. The focus here is on features that are derived from data of previously detected/segmented single tree objects. Three key questions are addressed:
1. Which point cloud and waveform features have been used to classify trees into species classes?
2. How accurate are the classification strategies based on the derived features, and what are their limitations?
3. Which point cloud and waveform features have emerged as indicative features for a specific tree species?
This paper reviews peer-reviewed journal papers that investigate FWF ALS for tree species classification with special emphasis on derived FWF features. In order to form a comprehensive review, publications of international conferences (such as ISPRS Archives/Annals and SilviLaser) with a focus on tree species classification using FWF ALS data are considered as well. The paper is divided into the following three sections: (1) overview of the derived and applied features for tree species classification (Section 2); (2) discussion of factors that limit and influence feature characteristics and tree species specific feature characteristics (Section 3); and (3) final conclusion and tabular summary of point cloud and waveform features (Section 4).

Derived and Applied Features for Tree Species Classification
Studies dealing with FWF ALS data, and with tree classification in particular, use varying terminology to describe the observables of the returned laser signal. Their terminology differs with respect to naming and abbreviation of the observables. For example, the received signal amplitude is defined as an attribute of the data in one study, while in others it is defined as a feature or parameter. The same applies to the combination of the observables for classification, which is described as model, feature set or saliency. Furthermore, varying abbreviations such as A, S1 or A0 are used for the received signal amplitude. This variety hampers the comparability of results and discussion. Therefore, the terms feature and feature set are used in this review to describe the observables of the returned laser signal and their combination for classification. The abbreviations refer to frequently used abbreviations within the studies or, in case of diverse abbreviations, to acronyms of the feature naming (Tables A1 and A2).

Full-Waveform Data and Single Tree Classification
Information about the individual tree species and the subsequent estimation of species-specific derivatives is (among other parameters) required for forest management and is one of the biggest challenges in ALS-based inventories [42]. In contrast to discrete return laser scanners, full-waveform laser scanners record the complete backscattered waveform [5,43]. Instead of singular return locations, the FWF data reveal the detailed time-dependent distribution of the targets in the beam path and the physical backscattering properties that the transmitted laser pulse collects on its way to the object surface and back [4,44].
Since the recorded waveform of highly vegetated areas is usually composed of several backscattered echoes, signal decomposition is performed in order to extract the distinct targets along the path. To reconstruct the position, size, orientation and reflectance characteristics of scattering targets, [45] identify the following general tasks: (1) determination of the number of scatterers; (2) calculation of their distance from the scanner; (3) decomposition of the waveform into single echoes; (4) calculation of additional echo parameters; and (5) radiometric calibration.
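As a rough illustration of tasks (1) to (4), the following Python sketch detects echoes in a sampled waveform by a simple local-maximum search and estimates range, amplitude and echo width (full width at half maximum) for each echo. It is a simplified stand-in and not the decomposition used in the reviewed studies, since no waveform model is fitted. The names `detect_echoes`, `dt_ns` and `noise_threshold`, as well as the synthetic waveform, are assumptions made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def _crossing(samples, outer, inner, level):
    """Linearly interpolate the sample index where the signal crosses `level`."""
    lo, hi = samples[outer], samples[inner]
    if lo >= level or hi == lo:  # waveform truncated at the record edge
        return float(outer)
    return outer + (inner - outer) * (level - lo) / (hi - lo)


def detect_echoes(samples, dt_ns, noise_threshold):
    """Return (range_m, amplitude, fwhm_ns) per local maximum above the threshold.

    Assumes positive amplitudes; the range is measured from the first sample.
    """
    echoes = []
    for i in range(1, len(samples) - 1):
        peak = samples[i]
        is_local_max = samples[i - 1] < peak >= samples[i + 1]
        if peak < noise_threshold or not is_local_max:
            continue
        half = peak / 2.0
        # Walk outwards to the first sample at or below half maximum on each side.
        left = i
        while left > 0 and samples[left] > half:
            left -= 1
        right = i
        while right < len(samples) - 1 and samples[right] > half:
            right += 1
        t_left = _crossing(samples, left, left + 1, half)
        t_right = _crossing(samples, right, right - 1, half)
        range_m = C * i * dt_ns * 1e-9 / 2.0  # two-way travel time to one-way range
        echoes.append((range_m, peak, (t_right - t_left) * dt_ns))
    return echoes


# Synthetic waveform (assumption): two Gaussian echoes at 20 ns and 40 ns,
# sampled at 1 ns, each with a width of about 4 ns at half maximum.
waveform = [
    100 * math.exp(-0.5 * ((t - 20) / 1.7) ** 2)
    + 60 * math.exp(-0.5 * ((t - 40) / 1.7) ** 2)
    for t in range(64)
]
echoes = detect_echoes(waveform, dt_ns=1.0, noise_threshold=5.0)
```

A model-based decomposition would replace the peak search by a fit of the chosen echo model, which matters most where echoes overlap; task (5), radiometric calibration, is not covered here.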
Most studies dealing with tree species classification perform the detection and extraction of single trees in a first step. Only laser scanning data associated with a detected tree are used for feature derivation and classification. The detection of single trees in general is conducted by considering the canopy height model [46][47][48] or the 3D point distribution within a defined neighborhood [49][50][51], as well as by adding pre-filtering steps and a hierarchical or a non-hierarchical rule base of further feature types. Most pre-filtering steps serve to exclude noisy data or to detect structure-related characteristics. Commonly applied filters are the minimum height (H min) threshold and the number and index of return [13,27,52-55], as well as the features echo width [13,56-65], backscatter cross-section and backscatter coefficient [13,56,58,64,65].
Based on these detected single tree objects, different feature types are derived and combined into feature sets, which are then used for tree species classification. The features derived from FWF data cover features related to the point cloud and features related to the waveform. Because the majority of reviewed studies discuss species-specific feature behavior and single feature performance only to a limited extent, a general reflection on the features successfully applied for classification (Sections 2.2 and 2.3) is separated from a discussion of feature characteristics in relation to single tree species. The assignment of single features and their characteristics to a specific species is discussed in Section 3.2.

Derived Features for Tree Species Classification
The classification into tree species is achieved using laser scanning features. A laser scanning feature describes an observable characteristic of the returned laser signal. These features can be categorized into different types, and the types can be assigned to two classes. A first distinction is made between the types geometric, radiometric and waveform features. Geometric features describe/represent the geometry of the tree object or of a part of the tree, e.g., a conical or rounded crown shape. Radiometric features describe/represent radiometric characteristics like the backscatter cross-section. While radiometric features are assigned to the single points of the point cloud, waveform features, in contrast, refer to the whole received waveform (e.g., the distance between two waveform echoes and the skewness of the waveform peaks). Secondly, these feature types can be assigned to two classes, considering the different level of spatial correlation/aggregation. A distinction is made here between the class subset feature, which comprises/correlates a feature within a certain spatial subunit of the tree object (Table 1), and the class object feature, which comprises/correlates a feature with regard to the entire tree object (Table 2). The subunit is defined by a spatial neighborhood in 3D or 2D within the entire tree object (grid, raster, voxel, height layer bins and neighboring points). Subset and object features often reflect the statistics of a feature, such as the mean echo amplitude of all echoes located within a single voxel cell or the mean height difference of the 10 nearest neighbors of a point. In the following, the derived features of the studies are categorized by these feature types (geometric, radiometric and waveform) and assigned to the two classes (subset and object features).
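The difference between the two classes can be illustrated with a short Python sketch that computes the mean echo amplitude once per voxel (subset feature) and once for the whole tree (object feature). The function names, the tuple layout `(x, y, z, amplitude)` and the voxel size are assumptions chosen for this example and are not taken from any of the reviewed studies.

```python
from collections import defaultdict
from statistics import mean


def voxel_mean_amplitude(echoes, voxel_size):
    """Subset feature: mean amplitude of all echoes falling into each voxel."""
    cells = defaultdict(list)
    for x, y, z, amplitude in echoes:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        cells[key].append(amplitude)
    return {key: mean(values) for key, values in cells.items()}


def object_mean_amplitude(echoes):
    """Object feature: mean amplitude over all echoes of the tree object."""
    return mean(amplitude for _, _, _, amplitude in echoes)


# Hypothetical echoes of one segmented tree: (x, y, z, amplitude)
tree_echoes = [
    (0.2, 0.2, 0.5, 10.0),
    (0.4, 0.1, 0.9, 20.0),
    (1.5, 0.2, 0.3, 40.0),
]
per_voxel = voxel_mean_amplitude(tree_echoes, voxel_size=1.0)
per_tree = object_mean_amplitude(tree_echoes)
```

The same pattern carries over to other subunits (raster cells, height layers) by changing how the grouping key is built.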
Since the FWF systems measure the time-varying signal of the laser pulse, a theoretically unlimited number of returns per pulse and a higher point density can be achieved. Thus, the FWF ALS data may provide richer spatial information about the tree characteristics compared to discrete ALS data. Therefore, successfully derived and applied geometric features of discrete data are transferred to FWF data for tree species classification. For example, the number of echoes (TNo) within defined subsets of different height layers (TNo rast,stats,filter [25,66]) and voxels (TNo voxel,column [55]) is used for classification. These density measures are often normalized by the total number of echoes of the tree object (P dens,bin,norm [52]) or are compared with other density subsets, e.g., the ratio of the number of echoes within a given search radius in 3D and 2D (ER 3D/2D [26]). In addition, distance measures are derived for classification, like the mean horizontal distance of height layer points to the previously detected tree trunk (∆D trunk-dist,horiz) [52,53,66]. A different approach for considering the point neighborhood is chosen by [67]. The authors use Haralick's texture features (HT i) [68], and features which depend on the L-function (L(t)) and Delaunay triangulation. Haralick's texture features are calculated from the 3D gray level co-occurrence matrix, which is based on the number of points per voxel in different XYZ directions. The L-function method is defined to derive the K-function's deviations from its expected value in a circle (with radius t) and to derive features from the L(t) curve. One example is the number of echoes, which is determined by the number of local minima per height layer (P dens,L-func_Npeak). The Delaunay feature (P TIN,Edge) defines the variance of edge lengths from triangulated points per height layer and their frequency distribution.
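A normalized density measure of the kind described above can be sketched in a few lines of Python: the echoes of one tree are counted per height layer and divided by the total number of echoes of the tree. The equal-thickness layering relative to the highest echo and the function name are assumptions for this example; the reviewed studies differ in how layers are defined.

```python
def normalized_layer_density(heights, n_layers):
    """Share of a tree's echoes per height layer (bottom layer first).

    `heights` are normalized echo heights above ground (all > 0). Layers have
    equal thickness between the ground and the highest echo.
    """
    top = max(heights)
    counts = [0] * n_layers
    for h in heights:
        # The highest echo itself is assigned to the top layer.
        layer = min(int(h / top * n_layers), n_layers - 1)
        counts[layer] += 1
    total = len(heights)
    return [count / total for count in counts]


# Hypothetical normalized echo heights (m) of one tree
shares = normalized_layer_density([1.0, 2.0, 9.0, 10.0, 10.0, 5.0], n_layers=2)
```

Because the shares sum to one, trees scanned with different point densities become comparable with respect to their vertical echo distribution.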
Besides considering geometric features, the statistics of waveform and radiometric features (e.g., echo amplitude and width, backscatter cross-section) are also calculated within a subunit [25,66]. The echo width is dependent on the amount, distribution and orientation of scattering elements along the laser beam direction. The height variations of small scatterers like needles of coniferous trees tend to broaden the echo width, and groups of small scatterers are no longer separable in the echo waveform [57,69,70]. Accordingly, a separation of different trees with varying structure can be expected by using the feature echo width. For instance, larch trees show higher averages in echo width compared to oaks (Quercus robur and Quercus petraea) and beeches (Fagus sylvatica) [47], and a higher standard deviation compared to spruce trees (Picea abies) [26], in particular in the upper crown layers (EW stats,h-layer).

[Table 1 (excerpt), radiometric subset features. Point-assigned: echo width EW, fuzzy small membership of echo width FEW CV, product of echo amplitude and width EAW (c: [60]; s: [25,66]). Raster statistics *1 (and additional filters *2): amplitude A rast,stats,filter (s: [25,66]). Height layer/bin/percentiles: filtered *2 statistics *1 of amplitude A perc,stats,filter (s: [66]).]
The object feature class comprises/correlates the features with regard to the entire tree object. The object features are calculated from individual detected echoes [24,52,53,73], the complete recorded waveforms [24,55] and from previously aggregated subsets like raster cells [25], voxels [55] and metric or percentage height layers [11]. A summary of all considered object features is shown in Table 2.
Geometric object features are derived by fitting forms to the shape of the point cloud. The fitting is usually applied to previously detected crown echoes and uses forms like ellipsoids [73], parabolic surfaces [52,66,73], convex hulls [49] and alpha shapes [66,74]. Derived features are the vertical length (Z obj,ellip, PS height) and radius (PS radius) of the fitted surface and the subsequent function parameters (PS a, PS b). Applied crown features for species classification are the ratio of crown length and tree height (CR lt), the ratio of crown length and width (CR lw) and the crown volume (CR vol). Such geometric features were already applied to discrete echo data to classify pine, poplar and maple trees with an overall accuracy of 88.8% [75]. In particular, maples are classified with a high producer's accuracy (95.9%). However, pine and poplar trees have similar CR lw.
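The two crown ratios can be computed directly from the crown echoes once a crown base height is known. The Python sketch below is a minimal illustration under stated assumptions: heights are already normalized to the ground, the crown base height is given, and the crown width is approximated by the larger of the two axis-aligned horizontal extents (the reviewed studies may define crown width differently, e.g., from a fitted form). The function name and the sample points are hypothetical.

```python
def crown_ratios(crown_points, crown_base_height):
    """Return (CR_lt, CR_lw) for one tree.

    crown_points: iterable of (x, y, z) with z as normalized height above ground.
    CR_lt = crown length / tree height, CR_lw = crown length / crown width.
    """
    xs, ys, zs = zip(*crown_points)
    tree_height = max(zs)
    crown_length = tree_height - crown_base_height
    # Assumption: crown width as the larger axis-aligned horizontal extent.
    crown_width = max(max(xs) - min(xs), max(ys) - min(ys))
    return crown_length / tree_height, crown_length / crown_width


# Hypothetical crown echoes (m) with a crown base at 10 m
points = [(0.0, 0.0, 20.0), (-2.0, 0.0, 12.0), (2.0, 1.0, 12.0), (0.0, -1.5, 15.0)]
cr_lt, cr_lw = crown_ratios(points, crown_base_height=10.0)
```

Both ratios are dimensionless, which makes them less sensitive to absolute tree size than crown length or crown volume alone.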
Applied object features of radiometric and waveform features are the statistics of the number of returns of all waveforms (TNo wave,stats), the amplitude (A stats,obj), the echo width (EW stats,obj), the backscatter cross-section (σ stats,obj) and of the product of echo amplitude and width (EAW). The waveform-related object statistics RWE and H m,energy of the returned waveform energy are used to differentiate between pine, spruce, linden and fir [24] and between pine, spruce and birch [76]. These two features are supplemented by the total pulse length for classifying Scots pine, Norway spruce and birch [27]. The distance between two waveform echoes is calculated by time difference (∆T ij, in ns) [55] and by difference in range (∆R ij, in m) [76]. The distances between waveform metrics, such as the distance between waveform beginning and centroid to ground (∆WR ij), are, among other features, applicable for classifying different coniferous and broadleaved species of subtropical forest [15].
In addition to statistical measures, features with regard to the spatial distribution of radiometric characteristics of the entire tree are used for classification. Frequently analyzed vertical distribution curves are the number of echoes per height layer (V H) and the values of amplitude (V A), echo width (V EW) and backscatter cross-section (V σ) over tree height [47]. As an example, the needles of larch trees (Larix decidua) induce higher echo width values in the upper canopy in comparison to the foliage of oak trees (Quercus robur and Quercus petraea), which have relatively constant values over tree height [47].
[Table 2 (excerpt). Vertical profile *1: echo width V EW (c: [13]; s: [47]).]

Applied Feature Sets for Tree Species Classification
The feature types and classes discussed above are usually combined for tree species classification (Table A4). The combined object statistics of (range corrected) amplitude, echo width, the product of echo amplitude and width and the number of echoes are suitable to classify Scots pine (Pinus sylvestris), Norway spruce (Picea abies), red oak (Quercus rubra) and European beech (Fagus sylvatica) of a German forest [25] and red pine (Pinus koraiensis), Koyama spruce (Picea koraiensis), Dahurian larch (Larix gmelinii), fir (Abies nephrolepis), white and ribbed birch (Betula platyphylla Suk. and Betula costata), linden (Tilia mandschurica) and Mongolian oak (Quercus mongolica) of a forest in north-east China [24]. Higher accuracies are achieved at the tree class level than at the species level at both sites (Germany: 91.7% vs. 80.4%; China: 85.7% vs. 55.1%). At the Chinese forest site, the classification based on corrected amplitude data is six percent more accurate than the one based on uncorrected amplitude data.
The proposed feature set in the study of [66] is supplemented by statistics per height percentile of elevation values, A, EAW and EW and filtered by the index of return. The classification of spruce (Picea abies), beech (Fagus sylvatica), fir (Abies alba) and maple trees (Acer pseudoplatanus) achieves an overall accuracy of 95% (leaf-on) and 94% (leaf-off) and a higher accuracy for the lower height layers in a leaf-off condition (95%) than in leaf-on (86%). The most important feature under both conditions is the mean EAW per tree. The mean echo width works better for leaf-off data. The percentiles of the number of ALS echoes and the percentage of echoes in a specific tree layer have little impact on the classification.
Waveform features proved to be appropriate for separating Scots pine (Pinus sylvestris) from Norway spruce (Picea abies) and birch (Betula sp.), achieving a producer's accuracy of 85.9% and an overall accuracy of 71.5% [76]. Birch and spruce tend towards higher misclassification. The most important features are the range difference between echoes (∆R 1st/last), the height of received waveform energy (H m,energy) and the sum of peak amplitudes (A sum,obj) per waveform. The combination of these waveform features with the statistics of peak amplitudes, the full-width-half-maximum, the total length of a pulse and the height at which 50% of the waveform energy is received classifies the three tree species as well [27]. Most important for classification is the energy of single returns.
A further approach to waveform feature derivation is taken by [55]. They compute the Principal Components (PC) of Fourier variables, which result from the discrete Fourier transformation of each waveform. By adding these features to geometric subset features, the overall classification accuracy improves by 6% to 85.4%. In particular, the differentiation between Douglas fir (Pseudotsuga menziesii (Mirb.) Franco) and red cedar (Thuja plicata Donn ex D. Don) benefits from PC-Fourier features. The coniferous (Pinus massoniana Lamb. and Pinus elliottii Engelm., Cunninghamia lanceolata (Lamb.) Hook.) and broadleaved species (Quercus acutissima Carruth., Liquidambar formosana Hance, Ilex chinensis Sims.) at a subtropical forest site are classified by [15] by using only features of an aggregated distribution function (∆WR centroid,ground, ∆WR beginning,ground, ∆WR beginning,1stpeak, TNo wave,stats, FS, RWE). All waveforms belonging to a detected tree are incorporated into distribution functions (the authors call them 'composite waveforms') utilizing a voxelized tree space on the waveform parameters (peak value and peak width). These features achieve an overall accuracy of 86.2% at the class level and of 68.6% at the species level. The distance between the centroid of the distribution and the ground (∆WR centroid,ground) proves to be very useful for discriminating tree species.
For classifying Norway spruce (Picea abies (L.)), Scots pine (Pinus sylvestris (L.)) and birch (Betula pendula (L.) and Betula pubescens (L.)) in a Swedish forest, the statistics of amplitude, echo width and number of echoes per waveform are supplemented by geometric subset features from deduced ellipsoid fittings (Z obj_ellip, PS height, PS radius) to the tree crown's point cloud [73]. The combination of statistical measures and geometric features achieves an overall accuracy of 71%, compared to 60% when using geometric features only. However, the ellipsoidal tree crown model fails for trees with understory and leads to erroneous fitted forms. The effectiveness of geometric subset features in regard to tree structure is tested by [67]. They derive features from the point distribution in vertical and horizontal dimensions within previously detected tree crowns. Four types are distinguished: (1) 3D texture features based on the calculation of Haralick's texture features (HT i) of voxels, which are averaged over varying directions in space; (2) relative degree of foliage clustering defined by gridded height layer number and its point pattern (e.g., P grid,VTMR); (3) relative scale of foliage clustering calculated from the L-function of the 2D density layer (e.g., P dens,L-func); and (4) gap distribution in horizontal and vertical direction (e.g., edge statistics of the calculated TIN, P TIN,Edge). Those features achieve an overall accuracy of 82.2% at the class level and of 77.5% at the species level for classifying jack pine (Pinus banksiana Lamb.), eastern white pine (Pinus strobus L.), sugar maple (Acer saccharum Marsh.) and trembling aspen (Populus tremuloides Michx.). The major structural difference among the species occurs at the top crown layer rather than at the middle and low stem layers. For example, aspen trees tend to exhibit clumped crowns at the top layers compared to other species. These clumped crowns have a higher variance-to-mean ratio (VTMR) of the echo pattern and a negative value of L(t) in comparison to the distributed foliage along the stem of maple trees (VTMR < 1, positive L(t)).
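The variance-to-mean ratio of a gridded height layer can be sketched as follows. The echoes of one layer are counted per grid cell, and the sample variance of the cell counts is divided by their mean: values above 1 indicate clumping, values below 1 a more even distribution. The function name, the use of the sample variance and the example point patterns are assumptions for this illustration and do not reproduce the exact implementation of [67].

```python
import math
from statistics import mean, variance


def vtmr(points_xy, extent, cell_size):
    """Variance-to-mean ratio of echo counts per grid cell in one height layer.

    extent: (xmin, ymin, xmax, ymax) of the crown layer; all points are assumed
    to lie inside it, and the grid must contain at least two cells and one point.
    """
    xmin, ymin, xmax, ymax = extent
    nx = max(1, math.ceil((xmax - xmin) / cell_size))
    ny = max(1, math.ceil((ymax - ymin) / cell_size))
    counts = [0] * (nx * ny)  # empty cells count as zeros
    for x, y in points_xy:
        ix = min(int((x - xmin) / cell_size), nx - 1)
        iy = min(int((y - ymin) / cell_size), ny - 1)
        counts[iy * nx + ix] += 1
    return variance(counts) / mean(counts)


extent = (0.0, 0.0, 2.0, 2.0)
# Hypothetical patterns: one echo per cell vs. all echoes in a single cell
regular = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
clumped = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.3), (0.4, 0.4)]
vtmr_regular = vtmr(regular, extent, cell_size=1.0)
vtmr_clumped = vtmr(clumped, extent, cell_size=1.0)
```

The result depends on the chosen cell size, so values are only comparable between trees when the same grid resolution is used.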

Influencing and Limiting Factors on Feature Characteristic and Tree Classification
The variety of features that can be derived from FWF ALS data enables the classification of tree species. However, the feature characteristics are affected by (1) factors related to vegetation structure (e.g., species stand age and forest stand structure); (2) technical factors related to the sensor settings, flight and data acquisition parameters (e.g., flying altitude and illuminated target size); and (3) factors related to data processing (e.g., waveform decomposition, crown delineation and data filtering) [27,77,78].

Factors related to Vegetation Structure
The crown shape differs according to tree species, local tree competition and growth age as well as treatment history [79,80]. For example, Scots pine trees tend towards a conical shape when young and a rounded and irregular shape as they mature [81]. Beeches, by contrast, respond rapidly to varying light conditions, which leads to a modified unstable relation between crown dimension and stem volume [15,66]. Hovi et al. [27] point out that such tree effects explain the majority (>44%) of the within-species variance of waveform energy, pulse length and peak amplitude. For example, suppressed pine trees tend to have slender trunks and lower foliage mass and density, as they try to reach the dominant canopy, and thus show a lower waveform energy and peak amplitude compared to dominant pine trees. Suppressed broadleaved trees grown below the general level of crown cover reveal a generally lower classification accuracy [15]. In particular, the canopy openness and the vertical arrangement of canopy elements affect the penetration depth and thus the variation of waveforms within the tree crown. Tree crowns with densely packed woody materials and foliage reduce the number of pulses that can penetrate and reach the ground and increase ∆WR centroid,ground, whereas for more transparent crowns, more pulses can pass through and decrease ∆WR centroid,ground. This is also connected to the time of data acquisition, as no or only a few leaves enable a higher penetration rate and cause differences in amplitude and width of the waveform peaks. For instance, the simulation of scanning a broadleaf deciduous tree stand of Acer rubrum and Quercus rubra demonstrated a much larger amplitude and width of the peak for the leaf-on case and a wavelength of 1064 nm [82]. This vertical arrangement of scattering elements influences the backscattered echo width, and height variations of small scatterers tend to broaden the return pulse [70].

Technical Factors related to Data Acquisition and Processing
Furthermore, technical factors such as flying height, scanning angle as well as the waveform sampling and processing affect the usability of features and hamper the transferability of derived feature thresholds.
A low point density due to a high flying height above ground level and a low sampling rate results in a varying (geometric) representation of vertical profile features and classification accuracy for the same tree. The same feature derivation and classification step is applied by [67] for decreased point density and increased voxel size, respectively, and shows a positive linear correlation (R² = 0.88) between point density and classification accuracy. A minimum point density of 50 points per square meter is required to achieve a classification accuracy higher than 70%, but 5 points per square meter are enough to achieve an accuracy of 50% (original data: 90 points m⁻², flown at 300 m above ground level with a Riegl LMS-Q560). Although the used vertical profile features are still effective at lower point densities, the authors state that higher densities are necessary to extract structural parameters.
Besides this point density effect, the range and beam divergence determine the size of the laser footprint area. The laser beam diverges from its nominal direction with range and creates a narrow conic shape. The transmitted energy is spread over the footprint area, and its intensity decreases towards the edges of the beam. Within the footprint area, a variable number of scatterers is encountered. These scatterers, mostly with different sizes and orientations in the case of vegetation, affect the power and shape of the backscattered waveform and consequently the derived features.
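The dependence of the footprint on range and beam divergence can be made concrete with a small Python sketch. It assumes a nadir-looking beam hitting a flat surface perpendicular to the beam axis and neglects the beam diameter at the exit aperture. The beam divergence of 0.5 mrad used below is a hypothetical value chosen for illustration and is not taken from the reviewed studies.

```python
import math


def footprint_diameter(range_m, divergence_mrad):
    """Footprint diameter (m) for a full beam divergence angle given in mrad."""
    return 2.0 * range_m * math.tan(divergence_mrad * 1e-3 / 2.0)


def footprint_area(range_m, divergence_mrad):
    """Circular footprint area (m^2) for the same simplified geometry."""
    d = footprint_diameter(range_m, divergence_mrad)
    return math.pi * d * d / 4.0


# Hypothetical configuration: 300 m range, 0.5 mrad beam divergence
diameter = footprint_diameter(300.0, 0.5)
area = footprint_area(300.0, 0.5)
```

Under these assumptions, doubling the flying height doubles the footprint diameter and quadruples the area over which the transmitted energy is spread.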
A further influencing factor is the derivation of normalized height values. The tree height and crown shape are often delineated by finding the local maxima and the connected points. The height normalization by subtracting the DTM from the non-ground elevation might be straightforward in flat areas, but in the presence of slope, this leads to a horizontal displacement of the tree location and a local distortion of the point cloud geometry within the tree crown. Therefore, [81] and [83] recommend a normalization considering the slope. One possible approach is normalization after point-based segmentation for each detected tree [84]. The tree's point cloud is normalized using the ground elevation under the highest point of the crown as a reference for the rest of the point cloud.
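The per-tree normalization described last can be sketched in Python as follows. All points of one segmented tree are reduced by a single reference elevation, namely the ground elevation below the highest crown point, instead of subtracting the DTM value below each individual point. The callable `ground_elevation_at` stands for any DTM lookup and is a hypothetical helper, as are the function name and the sloped example terrain.

```python
def normalize_tree_heights(tree_points, ground_elevation_at):
    """Normalize one tree's points by the ground elevation under its apex.

    tree_points: list of (x, y, z) with absolute elevations.
    ground_elevation_at: callable (x, y) -> ground elevation (DTM lookup).
    """
    apex_x, apex_y, _ = max(tree_points, key=lambda p: p[2])
    z_ref = ground_elevation_at(apex_x, apex_y)
    # The same offset is applied to every point, so the crown geometry is preserved.
    return [(x, y, z - z_ref) for x, y, z in tree_points]


def sloped_dtm(x, y):
    """Hypothetical terrain rising 0.5 m per meter in the x direction."""
    return 100.0 + 0.5 * x


tree = [(0.0, 0.0, 120.0), (2.0, 0.0, 115.0), (-2.0, 0.0, 113.0)]
normalized = normalize_tree_heights(tree, sloped_dtm)
# For comparison: point-wise DTM subtraction, which distorts the crown on slopes
pointwise = [(x, y, z - sloped_dtm(x, y)) for x, y, z in tree]
```

In the example, the two lower crown points keep their 2 m height difference after per-tree normalization, whereas point-wise subtraction assigns both the same height.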
The representation of vertical profile features and vegetation metrics in forested areas is also influenced by the scan angle [15,17,85]. In particular, a difference in canopy penetration depth due to varying incidence angles has been observed in areas with a large variation in tree height. While larger incidence angles (>10°) provide more information about the lower parts of the canopy, narrower scan angles increase the likelihood of obtaining returns from the ground [86]. With more detailed capturing of the lower layers, a precise derivation of the crown base becomes possible. This enhances the correctness of crown-related features and their applicability for subsequent classification. However, a more detailed representation of the lower layers increases the detection of understory as well. The understory included in the dataset may affect and distort the feature calculation, and thus further pre-processing is needed to remove it. The effect of the scan angle on waveform-based features (like ∆WR ij) and on classification accuracy is evaluated by [15], revealing no significant difference (~2%) in the overall accuracy when comparing the same features for waveforms of scan angles of <15° and of 15-30°.
When using features extracted from waveforms, several studies assume a perfectly nadir perspective of the laser beam and do not account for a slanted penetration of the laser beam in feature derivation. Consequently, the information derived from the waveform is supposed to describe the tree crown vertically. However, off-nadir measurements and varying trajectories of overlapping strips may lead to uncertainties in spatiotemporal analyses due to the influence of the scan angle on the recorded signal strength. The signal strength and the receiver's sampling rate and sensitivity induce relative shape differences in the waveforms, especially in fast rising 4 ns pulses [27]. To overcome these effects, a waveform normalization step is needed, comparable to radiometric calibration. The few studies dealing with such normalization (prior to classification) combine all waveforms into distribution functions (DF). The pseudo-vertical waveforms, as [87] call them, can result from the computation of cumulative waveforms from the sum of all signals falling within a tree crown boundary [22] or from the voxel-based derivation and calculation of the mean and maximum waveform amplitude in each voxel [15,87-89]. Hermosilla et al. [87] state that such waveform distribution functions provide denser and more continuous coverage and enhance the characterization of vegetation. Thus, more crown structure information is retrieved and a more consistent response at adjacent height levels is provided. However, the voxel size has to be carefully selected and should consider the laser footprint size, crown diameter, and average ground spacing of pulses [15]. If the horizontal voxel size is smaller than the laser footprint, many incomplete DFs of waveforms are produced, and the number of newly formed DFs is much higher than the number of directly observed raw waveforms. In contrast, a larger voxel size tends to produce over-synthesized DFs and consequently a loss of detailed geometric and radiometric information.
The occurrence of multiple echoes and the number of echoes in general depend on certain sensor settings (e.g., beam width and waveform sampling interval) and on waveform processing [90]. A larger beam width and a higher waveform sampling interval increase the chance of multiple echoes. Further, a more sensitive waveform decomposition algorithm may lead to a higher number of echoes [5]. However, single objects (e.g., branches and leaves) can only be detected as separate objects if their distance is larger than half the pulse length. For example, this distance is 0.6 m for a pulse length of 4 ns (recorded by the Riegl LMS-Q680). This is also connected to the echo width and to subsequent feature derivation and threshold transferability. Echo width values have to be considered carefully, since the width depends on the strength of the received pulse [60,65]. To tackle this problem, [60] and [63] apply a normalization procedure in an urban environment, achieving an overall classification accuracy of >93% and a correctness of 93% for trees. Lin [60] uses the concept of Fuzzy Small membership functions in a defined neighborhood, and [63] normalize the echo width by its minimum and maximum values, which are found from quantiles of the echo width distribution. Such approaches improve the transfer of thresholds of normalized echo width from region A to region B.
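The two quantitative points above, the minimum target separation and the quantile-based echo width normalization, can be sketched as follows. The sketch assumes the usual two-way travel relation (separation = c · pulse length / 2). The quantile levels (5% and 95%) and the clipping to [0, 1] are illustrative assumptions, since the levels used in [63] are not given here, and all function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def min_separation_m(pulse_length_ns):
    """Minimum range separation of two targets: half the pulse length in space."""
    return C * pulse_length_ns * 1e-9 / 2.0


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def normalize_echo_width(widths, q_low=0.05, q_high=0.95):
    """Scale echo widths to [0, 1] between two quantiles (assumed levels)."""
    ew_min = quantile(widths, q_low)
    ew_max = quantile(widths, q_high)
    if ew_max == ew_min:
        return [0.0 for _ in widths]
    # Values outside the quantile range are clipped (an assumption of this sketch)
    return [min(1.0, max(0.0, (w - ew_min) / (ew_max - ew_min))) for w in widths]


separation = min_separation_m(4.0)  # about 0.6 m for a 4 ns pulse
widths_ns = [4.0, 4.5, 5.0, 5.5, 6.0]  # hypothetical echo widths in ns
normalized = normalize_echo_width(widths_ns, q_low=0.0, q_high=1.0)
```

Using quantiles instead of the absolute minimum and maximum makes the scaling less sensitive to a few extreme echo widths, which is presumably why it helps when transferring thresholds between regions.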
Vegetation is usually represented by heterogeneous targets within the illuminated footprint with diverse geometric representation (e.g., area, orientation) and reflectance properties (e.g., leaves or needles, thin and thick branches and tree trunk). The reflectance properties of the scattering elements differ between the commonly applied wavelengths of 1064 nm and 1550 nm and influence the intensities of the returned energy [27,91,92]. For example, the returned intensities from leaves are similar to those from trunks/branches at 1064 nm and lower at 1550 nm, where leaf scattering is strongly attenuated by liquid water absorption. Using the backscattered signal strength for purposes of detection and classification requires a radiometric calibration, since it depends on various factors such as emitted pulse, range, incidence angle and atmospheric conditions. Hence, the recorded amplitude values vary for a given target, even within a single dataset, and need to be converted to absolute physical quantities, especially for multi-temporal analysis and for data from different ALS systems. The physical principles of radiometric calibration of FWF ALS data are discussed in several studies [1,93]. The resulting physical quantity is the backscatter cross-section σ, which represents the characteristics of the reflecting target, accounting for its size, reflectivity and directionality of scattering [43].
The backscatter cross-section constitutes a physical property and is independent of the transmitted laser pulse. It has shown good performance for vegetation classification [13,56,65]. Wagner et al. [65] recognize a variety of σ values within forest canopies and identify an influence of tree species and tree structure. Furthermore, σ is affected by the beam width, the flying altitude and the resultant variation in the energy of the returned signal and in the area of the laser beam footprint on the target. For example, [65] reveal a σ of <0.08 m² for vegetation (flying altitude: 500 m, footprint: 0.25 m), whereas [64] reveal a σ of >0.6 m² (flying altitude: 1900 m, footprint: 0.95 m). This makes it difficult to compare σ of targets with similar scattering properties. Therefore, the use of γ, defined as the normalized measure of σ regardless of the area of the footprint, is advantageous [56]. In the case of vegetation, where the target surface, e.g., leaves, is smaller than the laser footprint, the actual illuminated target area is unknown. However, γ depends on the number of echoes per laser shot from the trees [56,58]. For multiple returns, γ is smaller than for single returns. Echoes with a single peak might represent extended targets and their parameters can be directly related to the target's radiometric properties.
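The dependence of σ on the footprint can be made concrete with a small Python sketch. It assumes that γ is obtained by dividing σ by the area of a circular footprint and that the footprint diameter follows from range times beam divergence (small-angle approximation). The beam divergence of 0.5 mrad is an assumption chosen because it reproduces the two footprint sizes quoted above; it is not stated in the text. The function names are hypothetical.

```python
import math


def footprint_diameter_m(range_m, beam_divergence_mrad):
    """Footprint diameter from range and beam divergence (small-angle approximation)."""
    return range_m * beam_divergence_mrad * 1e-3


def footprint_area_m2(diameter_m):
    """Area of a circular footprint perpendicular to the beam."""
    return math.pi * (diameter_m / 2.0) ** 2


def backscatter_coefficient(sigma_m2, diameter_m):
    """gamma = sigma / footprint area (assumed definition of the normalization)."""
    return sigma_m2 / footprint_area_m2(diameter_m)


# Assumed beam divergence of 0.5 mrad for both surveys
d_low = footprint_diameter_m(500.0, 0.5)    # about 0.25 m
d_high = footprint_diameter_m(1900.0, 0.5)  # about 0.95 m

# The quoted sigma values are bounds (<0.08 m² and >0.6 m²), so the
# resulting gamma values are purely illustrative
gamma_low = backscatter_coefficient(0.08, d_low)
gamma_high = backscatter_coefficient(0.6, d_high)
```

With these illustrative inputs, the two σ bounds differ by a factor of about 7.5, while the corresponding γ values differ by less than a factor of two, which shows why the footprint-normalized quantity is easier to compare across acquisitions.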
The selection and implementation of methods for feature derivation also have an impact on feature characteristics. Resulting errors in tree segmentation may increase the likelihood of outliers, which influence the feature statistics (e.g., A mean per tree). Hovi et al. [27] compare the features derived by watershed segmentation (WS) and manual tree segmentation (MS) and analyze the impact on classification. They show that echo width has a higher weighting for MS-based classification and a lower weighting for WS-based classification. A prior detection and removal of outliers is recommended to derive a more robust classification [54].

Tree Species Specific Feature Characteristics
Full-waveform airborne laser scanning provides features for tree species classification at varying taxonomic levels. Depending on the species composition (coniferous and deciduous trees) and period of data acquisition (leaf-on and leaf-off), the applied features for species classification vary. Some features are better suited for certain species than for others. Table 3 summarizes these feature characteristics for the major tree species within the different studies.
For the differentiation between coniferous and deciduous trees, features related to the different crown shape and structure have to be used. The coniferous trees of the reviewed studies show a higher canopy density (TNo obj, H min) [26] and a higher ratio of single reflections (ER single/multiple,obj) [66] under leaf-off condition, and more canopy gaps (P TIN,Edge) [67] under leaf-on condition compared to deciduous trees. Furthermore, coniferous trees tend to display a higher backscatter cross-section [47], and the usage of amplitude features is more suitable under leaf-off condition than under leaf-on condition [52,53,94]. Generally, higher accuracies have been achieved under leaf-off condition when classifying at class level (coniferous and deciduous) compared to leaf-on data.
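The ratio of single to multiple echoes (ER single/multiple,obj) can be computed per tree object from the number of echoes recorded for each laser shot. The following Python sketch assumes that echoes (not pulses) are counted and that the input is a list of echo counts per shot. Both the function name and the input layout are hypothetical, and the counting convention in the cited studies may differ.

```python
def single_multiple_ratio(echo_counts_per_pulse):
    """Ratio of single echoes to multiple echoes for one tree object.

    echo_counts_per_pulse: number of echoes recorded for each laser shot that
    hits the tree (assumed input layout). Returns None if no multiple echoes
    are present, since the ratio is undefined in that case.
    """
    single = sum(1 for n in echo_counts_per_pulse if n == 1)
    multiple = sum(n for n in echo_counts_per_pulse if n > 1)
    if multiple == 0:
        return None
    return single / multiple


# Hypothetical tree hit by five pulses: three single echoes, one pulse with
# two echoes and one pulse with three echoes
ratio = single_multiple_ratio([1, 1, 1, 2, 3])
```

Under the leaf-off findings summarized above, a coniferous crown would be expected to yield a higher value of this ratio than a deciduous crown in the same dataset.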
Among tree species, major structural differences occur at the top crown layers and at the bases of the living crown. In particular, features related to penetration rate and biophysical properties of the top layers are advantageous for species classification. Pine trees show a lower mean amplitude and a lower mean echo width than the deciduous trees, as well as a lower number of echoes for each crown under leaf-on condition [27,73]. The lower amplitude may be due to the spectral and geometrical properties, meaning that pine trees reflect less light and have less tree crown density with fewer main branches [73]. Furthermore, pine trees have a higher proportion of single returns at the 90th height percentile, as shown by studies on discrete ALS data [75,95,96]. This difference in the spatial distribution enables the differentiation between pine and spruce [73]. Spruce trees show narrower echo widths and a lower standard deviation of echo width [26,27], which could be due to a dense surface of needles and rather horizontal branches, and the low height variation of the scatterers within the laser footprint. During leaf-off season, the surface of a larch is more heterogeneous than that of a spruce and therefore the height variation of the scatterers within the laser footprint is higher. The upper crown parts of larch trees have higher values of echo width and backscatter cross-section [26].
The features for deciduous tree species differ through the phenological cycle. Radiometric features of needles and leaves differ less in the leaf-on season than those of needles and bark in the leaf-off season [52]. The number of echoes (TNo) in the lower height layers is higher under leaf-off than under leaf-on conditions. The ratio of single and multiple echoes (ER single/multiple,obj) is the best feature for classification of beech and maple under leaf-off conditions. The crown density varies between deciduous species in the same season and allows a classification by waveform-based features. Vaughn et al. [55] classify bigleaf maple by the small inter-peak distance (∆R ij) and differentiate red alder and black cottonwood with an 80% accuracy by relative height percentiles. A structural difference of maple, white pine, jack pine and aspen is recognizable at the 14th height layer (≈crown base) [67], where maple and white pine have more leaves and branches distributed randomly (positive L(t)) than aspen and jack pine (negative L(t)) in leaf-on season. Furthermore, growing understory is more often detected under maple trees than under coniferous trees, which makes the detection of the crown base more difficult. Comparing the penetration features, poplar trees have a higher number of last returns closer to ground level in the foliated season.
Waveform and radiometric features are able to classify oak and beech species in leaf-off season [47]. Both species belong to the family Fagaceae and show relatively constant average values for echo width (V EW) and a decreasing backscatter cross-section (V σ) over the height profile. Small oaks have a lower average backscatter cross-section than taller trees in the same area. The sparse and evenly distributed foliage of birch crowns in leaf-on season induces widened returns with lower peak amplitudes [27].
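A vertical profile such as V EW or V σ can be thought of as a per-layer summary of an echo attribute along the tree height. The Python sketch below computes the mean attribute value per relative-height bin. The number of bins, the use of relative height and the function name are assumptions for illustration and do not reproduce the binning used in [47].

```python
def vertical_profile(echoes, tree_height_m, n_bins=10):
    """Mean attribute value per relative-height bin, from crown base upward.

    echoes: iterable of (height_above_ground_m, value) tuples, where value is
    e.g. echo width in ns or backscatter cross-section in m² (assumed layout).
    Bins without echoes are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for height, value in echoes:
        rel = min(max(height / tree_height_m, 0.0), 1.0)
        idx = min(int(rel * n_bins), n_bins - 1)  # top of the tree falls in last bin
        sums[idx] += value
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical echoes of a 20 m tree: (height in m, echo width in ns)
echoes = [(2.0, 4.0), (4.0, 5.0), (12.0, 4.5), (19.0, 6.0), (20.0, 5.0)]
v_ew = vertical_profile(echoes, tree_height_m=20.0, n_bins=4)
```

A profile that stays roughly constant across bins would correspond to the behavior reported for the echo width of oak and beech, whereas a trend across bins would correspond to the reported change in backscatter cross-section.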
Table 3. Summary of species-specific feature characteristics of the major tree species.

Taxonomy (Tree Class and Species) | Feature Characteristic
Coniferous trees (vs. deciduous trees) | higher canopy density (TNo obj, H min) [26]; higher ratio of single reflections (ER single/multiple,obj) under leaf-off [66]; more canopy gaps (P TIN,Edge) under leaf-on condition [67]; tend to display higher backscatter cross-section (σ) [47]; usage of amplitude features (A) is more suitable under leaf-off condition than under leaf-on [52,53,94]; broadening of echo width (EW) [57,69,70]
Larch | higher values of echo width at upper crown parts (V EW) [26,47]; higher backscatter cross-section at upper crown parts (V σ) [26]
Pine (vs. deciduous trees) | lower mean amplitude (A mean,obj) and lower mean echo width (EW mean,obj) [27,73]; lower number of echoes (TNo) for each crown under leaf-on condition [27,73]
Spruce | narrower echo width and lower standard deviation of echo width [26,27]
Deciduous trees | features differ through the phenological cycle: radiometric features of needles and leaves tend to be less different in leaf-on than needles and bark in leaf-off season [52]
Aspen | negative L(t) in leaf-on season [67]; higher variance-to-mean-ratio (VTMR) of echo pattern [67]
Beech | relatively constant values of average echo width (of bins) over the height profile [47]; decreasing backscatter cross-section over the height profile (V σ) [47]
Birch | widened echo width (EW) [27]; lower peak amplitude [27]
Maple | bigleaf maple: small inter-peak distance (∆R ij) [55]; positive L(t) in leaf-on season [67]; variance-to-mean-ratio (VTMR) of echo pattern lower than one [67]
Oak | relatively constant values of average echo width over the height profile (V EW) [47]; decreasing backscatter cross-section over the height profile (V σ) [47]; small oaks: lower average backscatter cross-section (σ stats,obj) compared to taller trees [47]
Poplar | higher number of last returns (TNo last) closer to ground level in leaf-on season
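The variance-to-mean ratio (VTMR) listed for aspen and maple can be illustrated with a short Python sketch. It assumes that the echoes of one height layer have already been counted per grid cell and that the population variance is used. The function name, the input layout and the choice of population over sample variance are assumptions, not details taken from [67].

```python
def variance_to_mean_ratio(cell_counts):
    """VTMR of echo counts per grid cell within one height layer.

    cell_counts: number of echoes in each grid cell (assumed input layout).
    Returns None for an empty layer, where the ratio is undefined.
    """
    n = len(cell_counts)
    if n == 0:
        return None
    mean = sum(cell_counts) / n
    if mean == 0:
        return None
    variance = sum((c - mean) ** 2 for c in cell_counts) / n  # population variance
    return variance / mean


# Hypothetical layers with the same total number of echoes
regular = variance_to_mean_ratio([2, 2, 2, 2])    # evenly spread echoes
clustered = variance_to_mean_ratio([0, 0, 0, 8])  # echoes concentrated in one cell
```

In general, a VTMR below one indicates a more regular spatial pattern than random, a value near one a random (Poisson-like) pattern, and a value above one a clustered pattern.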

Conclusions
This review gives a detailed overview of derived point cloud and waveform features of small-footprint full-waveform airborne laser scanning data for tree species classification. The advantage of full-waveform features compared to features derived from discrete returns is due to the available waveform information that can be either accessed directly or derived by further processing. Furthermore, a higher point density and a theoretically unlimited number of returns that are recorded for each emitted pulse are present. The reviewed literature clearly shows that these FWF features, which reflect the geometric and radiometric characteristics of single trees, have the potential to improve tree species classification accuracies. Moreover, geometric feature types, already successfully applied to discrete data, have been transferred to FWF point cloud data. All subset and object features used for classifications are shown in Tables 1 and 2. Main species-specific feature characteristics are summarized in Table 3.
For the classification of trees, features related to the backscatter signal strength at the upper crown parts (e.g., echo width at 90th height percentile), as well as the vertical distribution of scatterers (e.g., ratio of crown length and width CR lw) and its backscatter characteristics (e.g., echo width V EW), are the most important. The classification at the tree species level needs varying feature compositions, depending on the underlying species (coniferous and deciduous trees) and the time of data acquisition (leaf-on and leaf-off). For the differentiation between coniferous and deciduous trees, features related to different crown structures, like the proportion of echoes (ER 3D/2D, ER single/multiple,obj), are used for classification. Coniferous trees tend to show a higher ratio of single reflections compared to deciduous trees under leaf-off condition, and a broadening of the echo width (EW stats,h-layer), particularly for upper crown parts. Amplitude features are more suitable for classification under leaf-off than under leaf-on condition, because the amplitudes of needles differ more from those of bark than from those of leaves. The different tree species can be better distinguished from each other by subset features of waveform and radiometric features and their comparison over the vertical profile. The vertical echo width profile (V EW) of beeches shows relatively constant values, whereas its backscatter cross-section (V σ) decreases with increasing height. Conversely, larch trees tend to show higher values of echo width and backscatter cross-section at upper crown parts. Waveform features are sensitive to the canopy openness and the vertical arrangement of canopy elements and are thus indicative of different species. The larger leaves of bigleaf maple trees cause more distinctive and noticeable peaks in the return signal and show a smaller inter-peak distance (∆R ij/∆T ij), unlike other hardwood species.
Higher overall classification accuracies have been achieved at the class level (>80%, up to ~97%) compared to a classification at the species level (>70%, up to ~88%), whereas the producer's accuracy of different species ranges from low to high (>90%) levels depending on the species composition. However, a detailed discussion of species-specific feature behavior is limited within the studies. Furthermore, most of the research to date related to tree species classification by FWF ALS data has been conducted in temperate and boreal forests, whereas only a few studies from (sub-)tropical and urban sites have been published. In particular, the multitude of different trees (species and shape) growing densely in forested (sub-)tropical areas affects the applicability of distinct features for species classification and requires a closer examination. In addition, the feature characteristics are affected by factors related to the vegetation structure and by technical factors related to data acquisition and data processing, which hamper the transferability of species-specific feature thresholds. Scots pine trees, for example, tend towards a conical shape when young and a rounded and irregular shape as they mature. Beeches, by contrast, respond rapidly to varying light conditions, which leads to a modified, unstable relation between crown dimension and stem volume. Technical factors like scanning range and beam divergence determine the size of the laser footprint area. Within the footprint area, a variable number of scatterers is encountered. These scatterers, mostly with different sizes and orientations in the case of vegetation, affect the power and shape of the backscattered waveform and consequently the derived features. Such factors need to be taken into consideration when deriving features for classification and when transferring a feature threshold from one study area to another.
Investigations on species-specific feature characteristics and a detailed analysis of the influence of technical factors, methods for tree detection and varying growing conditions on the feature characteristics should be a focus of future work. Furthermore, the transferability of already successfully applied features from discrete ALS to FWF ALS data is to be investigated in more detail in further studies. Such comprehensive feature analysis and collection could enable the provision of a (tree) signature database of FWF ALS data. Such a signature database would simplify the feature derivation and the transferability as well as the understanding of species-specific behavior.
The number of echoes normalized by the total number of echoes of the tree object at given height layer.

P dens,L-func
The L-function features of echo number.

TNo rast,stats,filter
The number of echoes within a defined height layer based on raster-based calculations.For example TNo rast,mean,single as the average of the number of single echoes of all raster cells at a defined height.

TNo voxel,column
The number of echoes per voxel is related to the number of echoes of all subjacent voxels.

TNo wave,stats
The statistics of the number of echoes of all waveforms.

V A
Vertical profile of amplitude values.

V EW
Vertical profile of echo width values.

V H
Vertical profile of number of echoes.

V σ
Vertical profile of backscatter cross-section values.

V *_derivative
Derivative of the vertical profile of a feature, e.g., skewness of V EW (V EW_skewness).

Z obj,ellip (m)
Vertical length of ellipsoid fitted to tree crown.

∆D trunk-dist,horiz (m)
The mean horizontal distance of an echo to the previously detected tree trunk.

∆R ij (m)
Distance between two waveform echoes i and j calculated by difference in range, e.g., the distance between the first and the last echo in meters (∆R 1st/last).

∆T ij (ns)
Distance between two waveform echoes i and j calculated by time difference, e.g., the distance between the first and the last echo in nanoseconds (∆T 1st/last).

Table 1. Subset features for classification differentiated by geometric and radiometric feature types and related to studies at taxonomic class and species level.
Elevation difference ∆H ij between echo i and reference echo j, Elevation variance of all echoes ∆H var, Elevation difference between highest and lowest elevation value; Point density P dens, Penetration Index PI, Echo Ratio ER 3D/2D (slope adaptive) and ER ME (echo index) | c: [23,54,56,60,61,71,72]
Planarity: Plane residuals 1, Deviation of local normal vector η Z, Structure tensor planarity T P and omnivariance T O | c: [54,60,61,72]
Height layer/bin/percentiles: Average echo number N avg,bin, Maximum echo number deviation from the average echo number N nb,bin, Point density P dens,bin, Graph features of connected height layers (e.g., top distance TD), Variance-to-mean-ratio (VTMR) of number of echoes of gridded height layer P grid,VTMR, Mean trunk distance ∆D trunk-dist,horiz, Statistics *1 of TIN edges P TIN,Edge, filtered *2 statistics *1 of height H perc,stats,filter, L-function features of echo number P dens,L-func * | c: [50,71] s: [52,55,66,67,73]
Raster Statistics *1 (and Additional Filters *2): Number of echoes TNo rast,stats,filter | s: [25,66]
Voxel: Echo number voxel column ratio TNo voxel,column, echo number voxel area ratio ∆Area voxel,ch, Haralick's texture features of echo number voxels HT i | s: [55,67]

P dens,L-func_Npeak
Number of echoes that are determined by the number of local minima per height layer of an L-function.

P grid,VTMR
Variance-to-mean-ratio of number of echoes of gridded height layer.

PS a, PS b
Function parameters of the parabolic surface fitted to the tree crown.

PS height (m)(%)
Vertical length of the parabolic surface fitted to the tree crown.

PS radius (m)
Radius of the parabolic surface fitted to the tree crown.

P TIN,Edge (m)
Variance of edge lengths from Delaunay triangulated points per height layer and their frequency distribution.

RWE (DN)
Total returned waveform energy.

TNo
Total number of echoes.

TNo obj,Hmin
Number of points above a defined height threshold.

Table 2. Object features for classification differentiated by geometric, waveform and radiometric feature types and related to studies at taxonomic class and species level.
Point distribution: Echo Ratio ER 3D/2D (slope adaptive) and ER ME (echo index), Height of center of gravity H gravity, Percentage of laser echoes above step-off count TNo Perc,Hmin,obj, Sphere-based minimum projection area Area proj, Crown related distributions (e.g., CR pdens,derivatives) | c: [13,59]
Vertical profile: Height values V H,stats | s: [47]
Shape: Ellipsoid features (e.g., height Z obj,ellip), Parabolic features (e.g., radius PS radius), Crown features (e.g., crown length-tree height-ratio CR lt) | s: [28,66,73]
Waveform Statistics *1: Sum of waveform amplitude A sum,obj, Returned waveform energy RWE and the height value H m,energy of e.g., 50% of the total energy, Number of echoes per waveform TNo wave,stats and between peak indices ∆No ij, distance between peak indices in range ∆R ij and time ∆T ij, distance between waveform metrics in range ∆WR ij and time ∆WT ij, Overlap width of first and second echo ∆OW 1st/2nd,stats, Front slope angle from waveform beginning to first peak FS, Product of echo amplitude and width EAW stats,obj, Echo width EW stats,obj, Shape parameter of Gaussian decomposition α decomp,wave (e.g., skewness) | c: [15,64] s: [15,55,66,73,76]

Table A2. The following feature abbreviations are used in this manuscript.

The statistics of amplitude A, e.g., mean A of tree object (A mean,obj).

Crown volume derivatives, e.g., CR vol in relation to crown length (CR vol,l), width (CR vol,w) or to tree height (CR vol,t).

EAW (DN)
The product of echo amplitude and width, e.g., mean EAW of tree object (EAW mean,obj).

ER 3D/2D
Echo ratio. The number of points in 3D within a fixed search distance is related to the number of points in 2D found within the same distance.

ER single/multiple,obj
Ratio of the number of single echoes to the number of multiple echoes.

EW stats,obj (ns)
The statistics of echo width, e.g., mean EW of tree object (EW mean,obj).

EW stats,h-layer
The statistics of echo width of a height layer, e.g., mean EW of the upper 2 m (EW mean,u2m).

FS (°)
Front slope angle from waveform beginning to first peak.

H m,energy (m)(%)
Height at which a specific amount of energy is reached, e.g., 50% of the returned energy.

H min (m)(%)
Height threshold as the minimum height.

HT i
Haralick's texture features calculated from a 3D grey level co-occurrence matrix based on the number of points per voxel in different directions in 3D space.

P dens,bin,norm