Modeling small-footprint airborne lidar-derived estimates of gap probability and leaf area index

: Airborne lidar point clouds of vegetation capture the 3-D distribution of its scattering elements, including leaves, branches, and ground features. Assessing the contribution from vegetation to the lidar point clouds requires an understanding of the physical interactions between the emitted laser pulses and their targets. Most of the current methods to estimate the gap probability ( P gap ) or leaf area index (LAI) from small-footprint airborne laser scan (ALS) point clouds rely on either point-number-based (PNB) or intensity-based (IB) approaches, with additional empirical correlations with ﬁeld measurements. However, site-speciﬁc parameterizations can limit the application of certain methods to other landscapes. The universality evaluation of these methods requires a physically based radiative transfer model that accounts for various lidar instrument speciﬁcations and environmental conditions. We conducted an extensive study to compare these approaches for various 3-D forest scenes using a point-cloud simulator developed for the latest version of the discrete anisotropic radiative transfer (DART) model. We investigated a range of variables for possible lidar point intensity, including radiometric quantities derived from Gaussian Decomposition (GD), such as the peak amplitude, standard deviation, integral of Gaussian proﬁles, and reﬂectance. The results disclosed that the PNB methods fail to capture the exact P gap as footprint size increases. By contrast, we veriﬁed that physical methods using lidar point intensity deﬁned by either the distance-weighted integral of Gaussian proﬁles or reﬂectance can estimate P gap and LAI with higher accuracy and reliability. Additionally, the removal of certain additional empirical correlation coe ﬃ cients is feasible. Routine use of small-footprint point-cloud radiometric measures to estimate P gap and the LAI potentially conﬁrms a departure from previous empirical studies, but this depends on additional parameters from lidar instrument vendors.


Introduction
Lidar remote sensing encompasses a broad range of technologies and applications [1]. Most remote sensing lidar devices use the time-of-flight technique to generate precise range measurements based on the reflected signals of outgoing laser pulses. Lidar systems are typically categorized according to either the platform, with its corresponding laser footprint diameter (i.e., large-footprint: >25 m [2,3]; mid-footprint: 3-25 m [4,5]; small-footprint: 10 cm-3 m [6][7][8]; terrestrial laser scan (TLS): 0.1-3 cm [9]), or the detector system for recording return energy (e.g., waveform, discrete returns, single-photon detection, etc.). Among the current lidar technologies, waveform lidar systems store the most comprehensive information, and the recorded waveform can be converted into discrete points by deconvolution, thresholding, zero-crossings, or Gaussian Decomposition (GD) [10].
In recent years, full-waveform lidars have been explored in depth to estimate gap fraction, leaf area index (LAI) profiles, and biomass [mid-to-large footprint [11][12][13][14][15], small-to-mid footprint [16,17], and TLS [18,19]. Most of these physically based approaches are capable of accurate estimation of gap probability (P gap ) and effective LAI (eLAI = ω · LAI, where ω is the clumping index [20]) without calibration with field measurements. However, conversion from P gap into true LAI in most of these approaches requires a generalization of ω which was derived from globally or regionally averaged satellite products [21,22] instead of from the lidar data itself. Indeed, ω is a complex parameter that accounts for clumping induced by landscape (e.g., spatial distribution of the canopy), canopy (e.g., within-crown leaf distribution), and shoots [20]. Clumping of shoots is beyond the scope of this study and therefore not studied. For small-footprint airborne laser scan (ALS), Hu, et al. [23] computed the geometrical path length distribution within the crown to consider the clumping induced by crown shape and landscape. However, within-crown leaf area density variation was not completely addressed.
Despite the comprehensive information contained in a lidar waveform, storage of lidar data as discrete points provides an efficient alternative. Discretization of the waveform can generate uncertainties [24], and it is not necessarily straightforward to attribute precise radiometric information to points, as "intensity" values are typically assigned by proprietary onboard processing systems that usually present a black box to lidar users. Understanding the algorithms that convert waveforms into points helps to avoid misuse of ambiguous definitions of radiometric quantities that can give erroneous estimations that violate physical principles. Information and accuracy losses during waveform discretization can also undermine the use of lidar points for biophysical parameter retrieval [25,26]. Given the influence of ambiguous coefficients and residual radiometric issues, there is considerable controversy over the point-cloud inversion methods used to estimate P gap from a computed laser penetration index (LPI), and in a further step the effective and the true Plant/Leaf Area Index (PAI/LAI) of vegetation [27]. For example, the accuracy of TLS inversions is affected by partial hits that depend on the dimensions of the laser beam and leaves [18,28,29]. Moreover, for ALS, the point density within an area or volume is usually used to estimate the LAI [30][31][32][33][34] or the leaf area density [35]. However, as mentioned by Armston, et al. [16] and Chen, et al. [17], methods based on point density provide estimations of P gap that rely on additional empirical correlation coefficients with field measurements. Ambiguous "intensity" information from lidar points has also been used to retrieve the LAI [36]. Most point-cloud inversion approaches lack a theoretical underpinning that would enable universal application under various instrument specifications and environmental conditions. The direct computation of LPI and ln LPI −1 (canopy interception described by the Beer-Lambert law by considering LPI as P gap ) from these approaches differ markedly. Furthermore, most previous empirical studies focused mainly on individual test sites (restricted environmental conditions) with a specific lidar device (fixed instrument) and a few airborne flights (limited experimental configurations). A general lack of rigorous sensitivity studies could raise questions regarding the accuracy and limitations of each approach: e.g., can any of the current methods be applied universally?
The methods developed for full-waveform data could be applied to point data if the radiometric information could be retrieved and understood for each point. The GD technique was developed to describe the waveform as a combination of discrete Gaussian profiles defined by peak amplitude, time shift, and standard deviation [37]. These quantities can be stored in point data with or without storage of the waveform [38][39][40][41][42][43]. Although these idealized quantities and the full waveform radiometric calibration [44,45] have been discussed previously, the physical interpretations under realistic conditions Remote Sens. 2020, 12, 4 3 of 31 have not been completely explored, such as the influences of footprint size, target size and orientation, terrain, clumped leaves, etc.
A robust radiometric linkage is required for various quantities that are involved in point-data generation and processing to understand the lidar point data and the inversion methods. Advances in lidar systems have been mirrored by development of lidar modules in radiative transfer models (RTM) [46][47][48][49], which can accurately simulate laser-target interactions under various instrument specifications (e.g., beam width, divergence and acquisition rate), acquisition and environmental conditions (e.g., flight altitude, viewing angle range, and terrain slope), and vegetation structures (e.g., leaf reflectance, leaf angle distributions, and canopy closure). The 3-D RTM framework is suitable for evaluating the sensitivity of different inversion approaches. It is capable of simulating laser pulse interactions with realistic 3-D scenes and recording the energy profile of the waveform. In recent years, many lidar waveform RTMs have been developed by extending the existing credible models, including FLIGHT [47], DIRSIG [50,51], RAYTRAN [52,53], LIBRAT [54], FLiES [55], and DART [48,56]. These RTMs have shown their capabilities in modeling lidar waveforms in the Radiation transfer Model Intercomparison (RAMI) project [57]. The DART waveform lidar module has been extended to adapt multi-pulse mode [58], which can efficiently simulate ALS and TLS waveform data in industrial lidar format, with recent bounding volume hierarchy (BVH) implementation [59] for ray tracing acceleration. Also, DART has the capability for solar signal and photon counting simulation [58]. These make DART a suitable tool for the sensitivity study and evaluation of different approaches.
This work takes advantage of the latest implementation in DART in physical modeling of point clouds from the existing multi-pulse waveform module, to conduct an in-depth study and intercomparisons of the existing gap probability and leaf area index inversion approaches of small-footprint ALS ( Table 2). In Section 2, we have reviewed and built the physical basis and the modeling approaches of various radiometric quantities associated with lidar point "intensity" for investigating various laser-target interactions in radiative transfer models. Additionally, we have described the implementation in the latest DART release to convert the simulated ALS and TLS waveform data into points. Building on the theoretical basis, in Section 3, we have conducted a comprehensive review with physical interpretation about the existing P gap and LAI estimation approaches from either point-number-based (PNB) methods or intensity-based (IB) methods. In Section 4, we have demonstrated the utility of the DART model in evaluating these approaches from sensitivity analyses with different footprint sizes, leaf areas, leaf density variations, foliar dimensions, and homogeneous/heterogeneous scene configurations. From the results of the sensitivity studies, we have discussed the optimization of LAI estimation through small-footprint ALS, which could leverage existing and future point datasets to develop more robust LAI map products.

Theoretical Background and Implementations in DART
The radiometric theories and point-cloud modeling in DART are presented in this section. Detailed scientific and technical aspects of the waveform lidar module are described by Gastellu-Etchegorry, et al. [48] and Yin, et al. [58].

Lidar Pulse
DART has improved lidar modeling in realistic acquisition geometry and power distribution. For an emitted laser pulse, two Gaussian profiles are defined: the temporal convolution of the transmitted pulse and receiver response function, S(t) =Ŝe − t 2 2s 2 s ; and the 2-D power profile P l (β) within the footprint cone such that the ratio of P l (β t ) at the boundary (half-divergence β t ) to the central maximumP l,β can be 0.5 (i.e., full width at half maximum (FWHM)), 1/e 2 , or 1/e [60]. The distribution follows: P l (β) =P l,β e − β 2 2s 2 β , where β represents the angular offset from the pulse direction and s β is the standard deviation of the angular divergence. The reception is defined with telescope area A t = The power spread within the footprint area (radius r fp = R · tan β t + d l 2 ) is precisely defined, where R is the sensor-to-target range and d l is the beam cross-section diameter at the "exit gate" of the laser generator. d l is negligible for aircraft and spaceborne platforms, but it has a critical influence on the TLS footprint dimension.

Lidar Point Cloud and the Corresponding Radiometric Quantities
GD is used to extract points from simulated waveform data. According to Wagner, et al. [37], in the presence of X echoes, the temporal power profile recorded by the receiver P r (t) can be approximated as: For the i th Gaussian profile, t i is the temporal centroid; s p,i = s 2 s + s 2 i where s i compensates for the broadening effect from an oblique surface or a cluster of leaves that cannot be distinguished in a single return, and s s is an instrument-specific constant unless the temporal profiles of every transmitted pulse are known [61]. s p,i ≈ s s if the footprint size is negligible compared to the pulse duration distance. The ability of GD to distinguish different targets relies on target size, surface angle, gap size, and acquisition rate.P i is the time-gated peak amplitude of the waveformP i = , where C cal is a calibration coefficient with pre-defined instrumental and experimental configurations, and detector efficiency that could be related to distance. Wagner, et al. [37] considered the differential backscatter cross-section σ = 4π Ω · ρ · A s of an ideal target with an effective receiving area A s , a natural reflectivity ρ, and a solid angleΩ in which all reflected fluxes are assumed to be uniformly distributed. However, for an actual multi-return lidar pulse with only the target distance known and no details of the interaction, partial hits can occur and both ρ and A s are unknown. Therefore, for each return, we refine this expression as: where A s,i = is the footprint area at a distance R i (d l is neglected). Here, we define the apparent reflectance ρ a,i as the ratio of reflected radiant flux from a surface, over a perpendicular infinite Lambertian surface with a reflectivity of 1, which can be considered similar to the biconical reflectance factor with the same incident and view direction [62] associated with every return to a pulse. By contrast, from the isotropic intensity given by Wagner, et al. [37], we considered isotropic scattered radiance of a Lambertian surface to define ρ a,i (intensity follows Lambert's cosine law). Thus, Ω = 2π cos θdΩ = π; and For each point output, the subjected results have five types of radiometric quantities (cf. Appendix A):

4.
Power integral I i of a return (after GD): Remote Sens. 2020, 12, 4 5 of 31 5. Apparent reflectance ρ a,i (after GD): where I = I · R 2 is the distance-weighted power integral. Indeed, ρ a,i is physically equivalent to the lidar backscattering coefficient, which was suggested for waveform radiometric calibration due to the normalization relative to the footprint area [44,64].
By contrast, the biconical reflectance factor or apparent reflectance of an ALS return can be defined from radiation transfer theory: where is the solid angle of return i to the receiver; P t = √ 2πŜs s is the total pulse power, and η is the system and atmospheric transmission factor. For various lidar devices, digital numbers that are linked to one ofP i , I i , or I i could be named "intensity" in the output. In contrast to I i and I i , P i cannot be used to directly derive ρ a,i without knowing s p,i of the Gaussian profile (unless s s s i ), which necessitates the recording [50] or appropriate processing [44] of the complete waveform. Note that some discrete-return lidar devices provide "reflectance" output by calibrating "intensity" values with reference targets (reflectivity of 10%, 20%, . . . ) at various distances [39,40]. Other information may also be available from instrument manufacturers. For example, without recording the full waveform, the online waveform processing of Riegl devices provides additional information in the extra byte associated with each point [42], but the computational details remain incomplete to the users. Correspondences between the derived radiometric quantities and the extra byte information for Riegl V-line and Q-line series are summarized in Table 1.
* available only for certain devices, ** may be derived from pulse deconvolution instead of Gaussian Decomposition (GD).

Radiative Transfer under Realistic Conditions
Under realistic conditions, ρ a,i is not equivalent to the natural reflectivity ρ i of a target. There are usually three cases of physical interactions that could occur ( Figure 1): Case A: single Lambertian target perpendicular to the pulse direction, i.e., s i = 0 and s p,i = s s is a constant. From Equation (4),P i is proportional to both I i and ρ a,i , and ρ a,i = ρ i .
Case B: single oblique Lambertian target, which is the most common case for TLS and small-footprint ALS. From Lambert's cosine law, I i = I i_perp · cos θ, where θ is the angle between the pulse and the target surface normal and I i_perp is the power integral in Case A. Therefore, ρ a,i = ρ i · cos θ if there is only a single return, which is consistent with findings by Hancock, et al. [19]. Standard deviation s i is a complex quantity that depends on the projection of the 2-D energy profile P l (β) onto the oblique target [41]. Unless s s s i , there is no direct link betweenP i and I i . Case C: a cluster of targets (i.e., clumped leaf volume and within-crown gap size smaller than the lidar footprint). The waveform presents a decayed profile from the peak as the laser penetrates through the cluster. The returns from several targets can merge to a single-return profile. In an ideal case: Remote Sens. 2020, 12, 4 6 of 31 Here, P gap is the gap probability of the canopy. T(Ω i ) · ρ·∆Ω i π is the backscatter transfer function along the direction Ω i , with T(Ω i ) defined as: where 2π is the leaf angle distribution of the cluster, and θ f,i is the angle between the leaf normal and Ω i . Combining Equation (7) with Equation (6) gives an expression for ρ a,i with three unknowns: Hence, the use of ρ a,i for the inversion of any of the unknowns requires valid hypotheses on the other two unknowns. Here, is the gap probability of the canopy. ( ) ⋅ ⋅ is the backscatter transfer function along the direction , with ( ) defined as: where ( ) is the leaf angle distribution of the cluster, and , is the angle between the leaf normal and . Combining Equation (7) with Equation (6) gives an expression for , with three unknowns: Hence, the use of , for the inversion of any of the unknowns requires valid hypotheses on the other two unknowns. Case A was the preliminary assumption by Wagner et al. [37]. It usually serves as an experimental setup for calibration. Actual interactions to generate small-footprint lidar points are either Case B or Case C. Both indicate that natural reflectivity cannot be derived from without knowing the type of interaction (number of returns, terrain slope, footprint size, leaf and gap size, etc.). Note that Case C has been extensively explored in the study for mid-to-large full-waveform lidar research [11][12][13][14]16,17]. For those cases, the waveform provides more lidar metrics than can be generated from small-footprint point clouds.

Theoretical Approaches
The gap probability over a canopy that depends on zenith angle can be approximated as ( ) = e ⋅ ⋅ ⋅ . Although the angular effect is an important factor [65,66], here we follow the majority of the past ALS work, which assumed a vertical direction. Considering only the vertical direction: Case A was the preliminary assumption by Wagner et al. [37]. It usually serves as an experimental setup for calibration. Actual interactions to generate small-footprint lidar points are either Case B or Case C. Both indicate that natural reflectivity ρ cannot be derived from ρ a without knowing the type of interaction (number of returns, terrain slope, footprint size, leaf and gap size, etc.). Note that Case C has been extensively explored in the study for mid-to-large full-waveform lidar research [11][12][13][14]16,17]. For those cases, the waveform provides more lidar metrics than can be generated from small-footprint point clouds.

Theoretical Approaches
The gap probability over a canopy that depends on zenith angle θ can be approximated as P gap (θ) = e −ω·G·LAI·cos θ . Although the angular effect is an important factor [65,66], here we follow the majority of the past ALS work, which assumed a vertical direction. Considering only the vertical direction: where G = 2π 2π cos θ f dΩ f is the unit leaf area projection along the vertical direction. For lidar, P gap and eLAI are typically derived from the laser penetration index (LPI) of either a single larger-footprint pulse or multiple small-footprint pulses.
For small-footprint ALS, the 3-D structure provided by lidar signal can further distinguish different components of the clumping [23]. For example, the clumping of a heterogeneous scene can be dominated by spatial distribution of tree crowns, shape of crown, and within-crown clumping. If the proportion of crown vertical projection area is defined as the vegetation canopy cover fraction f vcc by neglecting the within-crown gaps, then f vcc can be used to separate the between-crown spaces. We can derive an alternative expression for P gap : (11) where LAI is at landscape scale, and LAI/ f vcc is the LAI under the vegetation coverage. Note that ω in accounts for both the crown shape and the within-crown clumping. In Hu, et al. [23], f vcc was derived from lidar first returns by neglecting the lidar incident angle. Geometrical path length distribution was used to account for the crown shape contribution in ω in . Equation (11) provides a parsed format of Equation (10) specifically for small-footprint ALS, relying on the accuracy of direct estimations of f vcc and ω in from ALS data.

Practical Empirical Correlation
For ALS, the percentage of laser light penetration through a canopy [laser penetration index (LPI)] provides information about the density of foliage [67]. In practice, most of the derived values (such as LPI or eLAI) are empirically correlated with the field measurements, such as the gap probability/transmittance (e.g., [30][31][32], Morsdorf, et al. [33], Korhonen et al. [34]), the eLAI obtained by LAI-2000/hemispherical photos, or the LAI measured from direct methods (e.g., harvesting or destruction, etc.).The effective LAI and tree LAI of estimation can be expressed as: where G is either known or approximated. The relationships between the reference values and the estimated values can be described as: Note that α is not only the exponential coefficient to link LPI with P gap , but also the linear coefficient to link the estimated effective LAI (eLAI est ) with eLAI. For the ideal case α = 1, LPI, eLAI est , and LAI est are exactly equal to P gap , eLAI, and LAI, respectively, leaving one physical parameter ω as the slope to convert from eLAI est into the true LAI value. The statistical description used in Case C of Figure 1 [Equations (7)-(9)] for a single large-footprint pulse is also valid for the interpretation of integrated multiple small-footprint pulses within an area.

Review of LPI Computation
Explicitly, the LPI can be sampled by either the area fraction or power fraction of ground returns: where subscript indices v and g relate to vegetation and ground, respectively; P t is the incident lidar power on the targets; A proj is the projected area; γ is the ratio at which to convert vegetation return reflectance to ground return reflectance as if the pulse is intercepted by the ground. Distance R, apparent reflectance ρ a , and γ are all spatially aggregated.
According to the method of sampling area or intensity as illustrated in Equation (15), the approaches for estimating LPI through ALS are categorized as either point-number-based (PNB) or intensity-based (IB) in Table 2. Table 2. Methods for laser penetration index (LPI) estimation using airborne laser scan (ALS) points and references.

Point-Number-Based (PNB) Methods for LPI Computation
As shown in Table 2, given a small footprint size such that only a single return is retrieved from each pulse, and a high pulse density, the number of returns from leaves (N v ) and the ground (N g ) are statistical samplings of A v,proj and A g,proj in LPI all estimates. Monitoring larger regions with ALS requires a higher altitude, which makes the footprint size larger and increases the partial hit chance. LPI weighted balances the weight of each return from a pulse, where N t,i = N g,i + N v,i is the total number of returns per pulse with i = 1, 2, 3 . . . returns (N t,i = i), and N g,i is the corresponding number of ground returns. In older systems, limited to 2 or 4 returns per pulse, LPI first and LPI last were used, where N single counts the single returns from the ground (N g,single ) or from indistinguishable vegetation elements (N v,single ). LPI first tends to provide information on between-crown spaces, whereas LPI last contains information on both the between-crown spaces and the within-crown gaps. Usually, LPI first underestimates and LPI last overestimates P gap . LPI both balances this situation empirically, and has proven to be better sensitive to general gap sizes.

"Intensity"-Based (IB) Methods for LPI Computation
In theory, radiometric information is more suited than point density for assessing vegetation properties (e.g., LAI) [73]. Depending on the specifications of a certain device and on the user's choice, the "intensity" could be related toP, I, I , s p,i , or ρ a . However, in practice, IB methods have been used infrequently, partly due to the ambiguous radiometric quantities and the uncertain accuracies. The expression for intensity-based LPI (dependent on γ) is given in Table 2 and Equation (15). For a pulse fully intercepted by the ground, if the variation in the elevation slopes within the area is negligible and the surface is Lambertian, ρ a,g = ρ g · cos θ g , where θ g is averaged over the reconstructed terrain model [74,75]. For a pulse fully intercepted by the vegetation, by setting P gap = 0 in Equation (9), we get ρ a,v = ρ v · T(Ω i ). An expression for γ can be derived as: In previous studies, γ was usually assumed to be a constant. For example, γ = 0.5 was an empirical value given by Lefsky, et al. [72] for lidar devices with 1064 nm wavelength, but adopting the same value at 1550 nm can be inaccurate [36]. For typical ALS assumptions on horizontal terrain, vertical pulse incidence, and spherical leaf angle distribution, T(Ω i ) = 2/3, and: where ρ g and ρ v can be retrieved from ground measurements as prior values. Equation (17) is an ideal expression for any wavelength with measured ρ g /ρ v . Note that in Table 2, I is used instead of I for IB methods. Additionally, ρ a can replace I in Table 2 according to Equation (5), and both of them can represent the backscattering coefficient. Without any assumptions, the four unknowns in Equation (16) can vary across different areas. Thus, γ should not be universal. Indeed, the denominator of LPI γ ( Table 2) represents the cumulative I g of the ground as if there is no vegetation present. For full-waveform lidar, γ can be estimated with linear correlation using vegetation and ground backscattering coefficients [16,17]. γ was found to be relatively consistent with varying mid-to-large footprint size, but the correlation decreases with decreasing footprint size. In practice, the least-squares correlation is also strongly influenced by the diverse variances of I v and I g .
Milenković, et al. [64] proposed another method, which utilizes the nearest pure-ground pulse (pulse that generates a single ground return). All pulses and return intensities were classified into three types: pure-ground (intensity I pure g ), pure-vegetation, and vegetation-ground (with ground return intensity I g,v ). The returns of the second and third types were associated with the nearest first types (defined as I nearest v and I nearest g,v ). LPI nearest in Table 2 relies on the pure-ground pulses, and therefore requires large gaps to increase the probability of pure-ground pulses. Another critical assumption is that ρ a,g for both the pure-vegetation and vegetation-ground pulses are approximately equal to ρ a,g of the nearest pure-ground pulses. However, considering the ρ a,g calculation, ρ g could vary significantly due to understory conditions and woody debris, and θ g could be influenced by insufficient sampling, whereas more samples (including ground returns of vegetation-ground pulses) can be used for calculating θ g,i (from terrain model reconstruction) in LPI IB,γ estimation.

DART Simulations
The simulated point clouds of various critical parameters are studied and inter-compared with the LPI/LAI estimation approaches in Section 3. A DART scene takes 3-D cells to contain the Earth's elements for either turbid medium or facets [76]. We used homogeneous and heterogeneous forest scenes constructed from leaf facets to precisely model interactions with lidar pulses. Based on this approach, DART can directly calculate reference parameters for evaluating the inversion results.
The simulated lidar mimics the characteristics of Riegl VQ-480i (Table 3), a multi-return online waveform-processing lidar that is mounted on the G-LiHT platform [8]. The ground position of each pulse was derived from flight height, moving speed, pulse repetition rate, and scanning speed. The incident direction of each pulse was set to be vertical (θ g,i = 0). We focused more on the physical interactions themselves, following the majority of previous work, which neglected the angular effects, although it is an important factor for the gap fraction [65,66]. The sensor height R varied from 50 m to 1000 m, increasing the footprint diameter from 0.015 m to 0.30 m. At 1550 nm, natural reflectivities ρ v and ρ g were set to 0.243 (general leaf) and 0.340 (brown moss), respectively. The test scene consisted of a square vegetation plot and a flat ground surface (Figure 2a). The plot had a height of 12 m with a 2 m space below the canopy to facilitate the unambiguous classification of ground returns and vegetation returns. Each leaf was represented by a square (5 cm × 5 cm), and leaves within the canopy volume (22 m × 22 m × 10 m) had a spherical angle distribution. Under such ideal configuration, γ = 2.10 was computed from Equation (17). The scene LAI ranged from 1 to 6 for the sensitivity studies.  The test scene consisted of a square vegetation plot and a flat ground surface (Figure 2a). The plot had a height of 12 m with a 2 m space below the canopy to facilitate the unambiguous classification of ground returns and vegetation returns. Each leaf was represented by a square (5 cm × 5 cm), and leaves within the canopy volume (22 m × 22 m × 10 m) had a spherical angle distribution. Under such ideal configuration, = 2.10 was computed from Equation (17). The scene LAI ranged from 1 to 6 for the sensitivity studies.

Heterogeneous Scenes
A heterogeneous DART scene representing a general forest (100 m × 100 m) with randomly distributed ellipsoidal and conical crowns (327 of each) was used to test the lidar-derived LPI and LAI (Figure 2b). The crown height and diameter were set to 12 and 3.6 m, respectively. Two simulations were conducted with different leaf area densities set to 0.25 and 0.5 for each crown, resulting in LAI = 1 and 2, respectively, for the whole scene. The lidar data were simulated with 0.30 m footprint diameter. The LPI was calculated for each 10 m × 10 m area. The reference in this

Heterogeneous Scenes
A heterogeneous DART scene representing a general forest (100 m × 100 m) with randomly distributed ellipsoidal and conical crowns (327 of each) was used to test the lidar-derived LPI and LAI (Figure 2b). The crown height and diameter were set to 12 and 3.6 m, respectively. Two simulations were conducted with different leaf area densities set to 0.25 and 0.5 for each crown, resulting in LAI = 1 and 2, respectively, for the whole scene. The lidar data were simulated with 0.30 m footprint diameter. The LPI was calculated for each 10 m × 10 m area. The reference P gap in this case was estimated with the irradiance intercepted by the ground surface using a large number of nadir incident rays [77,78]. DART was able to generate the map of P gap for each area.
Within-crown leaf density variation was added to the heterogeneous scene of LAI = 2 ( Figure 2c) to study the influences of clumping and structural complexity on LAI retrieval. In DART, the non-random leaf distribution within the crown was parameterized for each species by the proportion of full cells and the vertical distribution of the leaf parameters. The simulated scene included 16 species, each with a different extent of clumping, to ensure enough variation of within-crown gaps. For each species, three layers with different vertical distributions of leaf density comprised each crown, determining the proportion over the whole canopy. The vertical distribution of leaf density and the proportion of full cells were randomly chosen for each layer of the crown. The LAI of each species were also randomly determined, with a total LAI of 2 for the whole scene. In addition, a 25% probability of empty 1 m × 1 m voxels was added into the crowns to further complicate the crown structure and clumping. The histogram of leaf area density is shown in Figure 2d.

Homogeneous Scene
The estimations of LPI in each 2 m × 2 m area from both the PNB and IB methods from Table 2 are illustrated in Figure 3, with various LAIs (1-6) and footprint diameters (0.015-0.30 m). The reference P gap is indicated with a dashed line for each subfigure.
The PNB methods to compute P gap across the homogenous scene have low accuracy and varied performances on LAI and footprint diameter. Persistently, LPI first underestimates P gap and LPI last overestimates P gap across all simulations. LPIs computed using other PNB approaches are in between these two values: LPI last > LPI both > LPI weighted or LPI all > LPI first , which is consistent with the previous studies referenced in Table 2. As the LAI increases, most of the PNB methods switch from underestimation to overestimation. For a fixed LAI, estimations with infinitesimal footprint size converge to P gap for all PNB methods, which tends to verify the capability of TLS to capture P gap accurately [79]. The average number of returns for each pulse approaches 1 (single return) as the footprint size approaches 0 (Figure 4a). With increasing footprint size across all simulations, LPI last , LPI first , and LPI both converge to saturated values of 1, 0, and 0.5, respectively. As the LAI increases, the saturation speed reduces due to the lower probability of energy penetration through the canopy (Figure 4b). Although LPI both balances the underestimation of LPI first and the overestimation of LPI last , the results do not indicate that LPI both is more accurate. By contrast, LPI weighted is closer to P gap than LPI all . The saturation speeds of LPI weighted and LPI all are slower than those of the other approaches. The total number of returns are illustrated in Figure 4a. The average number of returns per pulse gradually becomes saturated together with LPI weighted and LPI all , as shown in Figure 3. From the sensitivity study above, LPI weighted and LPI all have broader ranges of footprint size and LAI values than the other PNB methods.
Regarding the IB methods, the constant value of nearest pure-ground intensity I pure g for LPI nearest estimation is over-idealized for the simulations. Indeed, I pure g is a diverse variable for actual data due to soil moisture, litter, and terrain slope. For LPI fitted , the least-squares approach applied to the dataset of ρ a,g and ρ a,v of every single pulse is used to determine the fitted γ ( Figure 5). The scatter plots for LAI = 0.5, 2.0, and 5.0 are shown in Figure 5a-c, respectively, with increasing footprint diameter (0.06 m, 0.15 m, and 0.30 m, respectively). The fitted line determines the slope value (−γ), the y-axis intersection ( ρ a,g with a reference value of 0.340), and the x-axis intersection [ ρ a,g with a reference value of 0.162 from Equations (16) and (17)]. It should be noted that with increasing footprint diameter, the fitted γ value and the axial intersections converge to the reference values. The data variance decreases as correlation increases in Figure 5d,e. These results are consistent with [16] and [3], in which γ should be estimated using a larger footprint dimension with smaller variance for LAI retrieval (e.g., the spaceborne waveform lidars [2,80]). Note that the simulation might not represent the realistic conditions because of the constant reflectivity of the ground surface.
Remote Sens. 2019, 11, x FOR PEER REVIEW 12 of 32 simulation might not represent the realistic conditions because of the constant reflectivity of the ground surface. The three IB methods, LPI . , LPI , and LPI , generally gave accurate estimations of at all footprint sizes ( ≈ 1.0), except that LPI underestimated at small footprint sizes as expected. Due to the over-idealized simulation conditions, LPI was suitable for all footprint sizes, regardless of the structural and optical properties of the vegetation. In general, IB methods are preferable for variation of LAI values. As the LAI increases, the ground returns become weaker than the vegetation returns, but pulses with larger footprint size can still produce ground returns. Thus, the average numbers of returns for different LAI values converge to the same value in Figure 4.   The empirical coefficient ( ) in Equations (13) and (14) was studied for various estimation approaches with 1 (homogeneous scene). Figures 6 and 7 illustrate the correlation between the references and the estimations using different methods. The results from the homogeneous simulation are consistent with previous assertions mentioning that LPI derived from PNB methods requires to correlate with field measurements of [16,17]. For the small footprint diameter of 0.03 m, which mimics a TLS configuration, both exponential and linear correlations are successfully established for all the approaches. For the IB methods, 1.0 is observed for LPI and the LAI was derived for LPI . and LPI . LPI approached as the footprint size increased. For PNB methods, LPI and LPI were more accurate than the other PNB methods, which give either underestimations ( 3.38, 1.67) or overestimations ( 0.37). In Figure 7, the  The three IB methods, LPI γ=2.10 , LPI nearest , and LPI fitted , generally gave accurate estimations of P gap at all footprint sizes (α ≈ 1.0), except that LPI fitted underestimated P gap at small footprint sizes as expected. Due to the over-idealized simulation conditions, LPI nearest was suitable for all footprint sizes, regardless of the structural and optical properties of the vegetation. In general, IB methods are preferable for variation of LAI values. As the LAI increases, the ground returns become weaker than the vegetation returns, but pulses with larger footprint size can still produce ground returns. Thus, the average numbers of returns for different LAI values converge to the same value in Figure 4.
The empirical coefficient (α) in Equations (13) and (14) was studied for various estimation approaches with ω = 1 (homogeneous scene). Figures 6 and 7 illustrate the correlation between the references and the estimations using different methods. The results from the homogeneous simulation are consistent with previous assertions mentioning that LPI derived from PNB methods requires α to correlate with field measurements of P gap [16,17]. For the small footprint diameter of 0.03 m, which mimics a TLS configuration, both exponential and linear correlations are successfully established for all the approaches. For the IB methods, α ≈ 1.0 is observed for LPI and the LAI was derived for LPI γ=2. 10 and LPI nearest . LPI fitted approached P gap as the footprint size increased. For PNB methods, LPI weighted and LPI all were more accurate than the other PNB methods, which give either underestimations (α last = 3.38, α both = 1.67) or overestimations (α first = 0.37). In Figure 7, the LAI correlations are almost linear for the PNB methods with LAI > 1, and converge to the origin with varying α as the LAI value approaches 0, which is caused by the boundary partial hit effect due to the non-negligible ratio of footprint size to leaf size. The results also indicate that LPI weighted is preferable among all the PNB methods. Regarding larger footprint diameters (0.15 and 0.30 m) that mimic regular ALS configurations, α ≈ 1.0 is still maintained for the IB methods. By contrast, all the PNB methods demonstrate low sensitivities for both P gap and LAI correlations. LPI first and LPI last are not seen from Figure 7 because ln LPI −1 is consistently equal to infinity and 0, respectively. LPI both shows a negligible sensitivity as the LAI varies. Unexpected non-linear correlations can be observed for LPI weighted and LPI all with LAI < 3 (0.15 m footprint) and LAI < 1.5 (0.30 m footprint). Indeed, the merging of returns with high leaf density and large LAI (Case C of Figure 1) makes the linear correlation of PNB methods distorted and inappropriate for capturing the small gaps within the crown to give accurate estimations. Indeed, the merging of returns reduces the point number, but the reflectance would be cumulated into the merged point. Therefore, IB methods are not influenced by this effect. In a forest, foliar dimensions can vary a lot with tree species and growing stages. Even for the same LAI and footprint size, variation in foliar dimensions can influence the estimated LPI. Figure 8 shows the relative bias and sensitivity of lidar-derived LPIs (LAI fixed at 3) under different footprint sizes and foliar dimensions of the homogeneous scene. LPI (the most reliable PNB method, Figure 8a and c and LPI . (the idealized IB method, Figure 8b,d are studied here. LPI . is much less sensitive to leaf dimensions than LPI . We used the ratio of footprint size to leaf size to characterize the changes in accuracy. Figure 8c,d give the relative bias against the ratios for LPI and LPI . , respectively. LPI approaches the references when the ratio In a forest, foliar dimensions can vary a lot with tree species and growing stages. Even for the same LAI and footprint size, variation in foliar dimensions can influence the estimated LPI. Figure 8 shows the relative bias and sensitivity of lidar-derived LPIs (LAI fixed at 3) under different footprint sizes and foliar dimensions of the homogeneous scene. LPI weighted (the most reliable PNB method, Figure 8a and c and LPI γ=2.10 (the idealized IB method, Figure 8b,d are studied here. LPI γ=2.10 is much less sensitive to leaf dimensions than LPI weighted . We used the ratio of footprint size to leaf size to characterize the changes in accuracy. Figure 8c,d give the relative bias against the ratios for LPI weighted and LPI γ=2.10 , respectively. LPI weighted approaches the references when the ratio approaches 1.0, which can also be confirmed by the 0% difference slope in Figure 8a and the plot of LAI = 3 in Figure 3. LPI and P gap are close at 0.05 m footprint diameter but diverse for larger footprint size and for LAI = 1 (underestimation) or LAI = 5 (overestimation) as illustrated in the other plots of Figure 3. This suggests that LPI weighted is sensitive to the footprint size and foliar dimensions when it is used to estimate small within-crown gaps. The bias plot of LPI γ=2.10 converges to less than 3% when the ratio of footprint diameter to leaf length becomes larger than 6, which ensures that a lidar pulse covers an adequate number of foliar elements. Indeed, the relative error significantly increases with decreasing ratio value.

Heterogeneous Scene
For the heterogeneous scenes, landscape-scale areas (10 m × 10 m) were studied. Figure 9 and Table 4 show the correlation between the lidar-derived LPIs and the reference values (P gap and LAI), with a relatively large footprint diameter (0.3 m). For the PNB methods, the correlation depends more on the large gaps between crowns due to the sensitivity loss for the within-crown gaps at 0.3 m footprint size, as illustrated previously in Figure 6. We estimated the reference value of f vcc by thresholding the generated P gap map of 0.1 m × 0.1 m pixel size into a binary image of 1 (open space) and 0 (canopy cover), and computing the fraction within each area. Table 4 illustrates the results of correlations of various PNB and IB methods in terms of four derived parameters: α is the fitted exponential coefficient of LPI against P gap (Figure 9a-c) from Equation (13); α · ω −1 is the fitted linear correlation coefficient of reference LAI against eLAI est described by Equations (14) and (12) (slopes of Figure 9d-f); computed ω is generated by dividing α by α · ω −1 ; and R 2 is the coefficient of determination of reference LAI against eLAI est . The last column of Table 4 provides the reference values computed by using P gap as LPI. Additionally, derived results of both f vcc and P gap are shown in Figure 9 for comparison. Table 4. Derived and computed coefficients for the PNB and IB methods from Figure 9 and Equations (13) and (12). The reference derivations from P gap are in bold. For the derived α of Table 4 from Figure 9a-c, all of the IB methods (LPI nearest , LPI γ=2.10 , and LPI fitted ) were consistent with P gap with α ≈ 1 for varying LAI and within-crown clumping (bias < 3%). The varying leaf area over the entire scene benefits the building of the slope γ for LPI fitted , as illustrated in Figure 10. The differences between fitted γ were less than 0.5%, but the introduction of within-crown clumping reduced the fitted γ by 5%, which suggests that within-crown clumping increases the variance of the dataset. The correlation coefficient remained high for all conditions. For the PNB methods, all LPI estimates are strongly biased towards the reference P gap . LPI first severely underestimates P gap because of the small gaps within the crown. LPI weighted tends to give a more accurate estimation than LPI first . When the LAI varies from 2 to 1, LPI weighted provides an underestimation with α varying from 0.97 to 0.61. LPI last provides a strongly saturated overestimation with low sensitivity for LAI = 2, and loses its sensitivity for LAI = 1. LPI both averages the inaccuracy of both LPI first and LPI last . LPI all introduces a large offset. Additionally, LPI first slightly underestimates f vcc (red diamond scatter plot in Figure 9a,c) because of the multiple returns at the boundary of the crown, but it presents a feasible approach if the lidar incident angle is neglected.

Variables
In Figure 9d-f, according to Equation (12), the eLAI est from various approaches were linearly correlated with the reference LAI. Unlike in the homogeneous scene, ω is a non-negligible physical unknown. Therefore, eLAI est is used instead of LAI est . The slope α · ω −1 and R 2 values are shown in Table 4. Compared with the reference values derived from P gap , the three IB methods give results that are close to the references (bias < 3% and R 2 > 0.89). The estimations using all the returns (LPI all , LPI weighted , LPI γ=2.10 , and LPI nearest ) are closer to the references than the estimations partially using the points (LPI first , LPI last , and LPI both ). LPI first was insensitive to the changes in the LAI and clumping. Due to the non-linearity of Equation (11), explicit analytical references of α · ω −1 do not exist. It should be noted that although the clumping-parsed expression of Equation (11) does not show a linear relationship, most linear regression studies show high R 2 values in this study range, which verifies the past work in linear fitting of field measurements for LAI map generation and insensitive correlation of LPI last in estimating the large gaps between crowns. (13) and (12 (e) (f) Figure 9. Plots of against lidar-derived LPIs (a-c) and reference LAIs against eLAI (d-f) (estimated from Equation (12)) for a simulated scene with discrete tree crowns (LAI = 1 and 2) and within-crown clumping (LAI = 2). The slope ( ⋅ ) and R 2 values are shown in Table 4.
For the derived of Table 4 from Figure 9a-c, all of the IB methods (LPI , LPI . , and LPI ) were consistent with with ≈ 1 for varying LAI and within-crown clumping (bias < 3%). The varying leaf area over the entire scene benefits the building of the slope for LPI , as illustrated in Figure 10. The differences between fitted were less than 0.5%, but the introduction of within-crown clumping reduced the fitted by 5%, which suggests that within-crown clumping increases the variance of the dataset. The correlation coefficient remained high for all conditions. For the PNB methods, all LPI estimates are strongly biased towards the reference . LPI severely underestimates because of the small gaps within the crown. LPI tends to give a more Figure 9. Plots of P gap against lidar-derived LPIs (a-c) and reference LAIs against eLAI est (d-f) (estimated from Equation (12)) for a simulated scene with discrete tree crowns (LAI = 1 and 2) and within-crown clumping (LAI = 2). The slope (α · ω −1 ) and R 2 values are shown in Table 4.
With derived α and α · ω −1 from the P gap and LAI correlation studies, the scene clumping index ω is computed by dividing these two values. The reference ω computed using P gap and input LAI ranges from 0.45 to 0.68 in the simulations. Among all of them, the addition of within-crown clumping reduces ω to between 0.51 and 0.45. For the IB methods, the computed ω is consistent with high accuracy (bias < 2%). Indeed, ω is a physical quantity that should not vary with estimation approach. In general, the PNB methods produce various computed ω values, which suggests that most of the PNB methods have influences not only on an empirical correlation coefficient (i.e., α), but also on other physical quantities (i. e., ω). Surprisingly, among all the PNB methods, LPI weighted gave a close estimation of ω with bias < 4%, although the fitted α could be far away from 1.0. Therefore, LPI weighted partially interprets ALS data physically, and it is preferable over all the other PNB methods.
Remote Sens. 2019, 11, x FOR PEER REVIEW 20 of 32 In Figure 9d-f, according to Equation (12), the eLAI from various approaches were linearly correlated with the reference LAI. Unlike in the homogeneous scene, is a non-negligible physical unknown. Therefore, eLAI is used instead of LAI . The slope ⋅ and values are shown in Table 4. Compared with the reference values derived from , the three IB methods give results that are close to the references (bias < 3% and > 0.89). The estimations using all the returns (LPI , LPI , LPI . , and LPI ) are closer to the references than the estimations partially using the points (LPI , LPI , and LPI ). LPI was insensitive to the changes in the LAI and clumping. Due to the non-linearity of Equation (11), explicit analytical references of ⋅ do not exist. It should be noted that although the clumping-parsed expression of Equation (11) does not show a linear relationship, most linear regression studies show high values in this study range, which verifies the past work in linear fitting of field measurements for LAI map generation and insensitive correlation of LPI in estimating the large gaps between crowns. With derived and ⋅ from the and LAI correlation studies, the scene clumping index is computed by dividing these two values. The reference computed using and input LAI ranges from 0.45 to 0.68 in the simulations. Among all of them, the addition of within-crown clumping reduces to between 0.51 and 0.45. For the IB methods, the computed is consistent In addition to the experimental and environmental variables, instrumental specifications (e.g., the reflectance threshold ρ a ) can also influence the LPI estimation, and this is analyzed in Appendix B. The studies in Section 4.2 assumed that every infinitesimal peak from the waveform could be detected by the sensor.

Discussion
From a homogeneous layer of randomly scattered leaves, we found the point-based LPIs computed by small-footprint ALS either overestimate or underestimate P gap , and they become saturated relying on two factors: (1) the ratio of footprint diameter to leaf or vegetation element dimensions and (2) the LAI. The former factor determines a pulse's capability in generating multiple returns, and the latter factor determines the number of returns per pulse. To apply PNB methods over a homogeneously dense forest, the footprint size should be controlled within a range so that the sensitivity is maintained as the LAI decreases and the LPI estimations do not converge to a constant value (e.g., 1, 0.5, 0, etc.). The estimation of this range requires a deep understanding of the relationship between leaf dimensions, leaf density, and device specifications. As for increasing footprint size, the estimated P gap from the PNB methods becomes insensitive to LAI or leaf area density variation as well as within-crown clumping, even with the path length variation considered [23,27]. Sensitivity studies demonstrated that IB methods for physically estimating P gap and eLAI were accurate and less influenced by variations in footprint size, leaf area, vegetation cover, and foliar dimensions than the PNB methods. From the modeling results, the main advantage of the IB methods is that the empirical coefficient α can be removed from Equations (13) and (14), leaving only the physical quantities (i.e., ω) for LAI estimation if the threshold of ρ a is neglected (which could introduce bias as presented in Appendix B). They also have flexibility of use in a wide variety of instrumental, experimental, and environmental configurations, which cannot be achieved by PNB methods unless the footprint size is infinitesimally small (i.e., TLS configuration).
We have shown that some of the PNB methods are only sensitive to the large gaps, and the IB methods can capture the precise P gap of various gap sizes. Among the radiometric quantities derived from lidar, I i and ρ a are the only suitable quantities for inversion. In practice, lidar point intensity has more unresolved uncertainties than point number. The ground reflectance variance can be large due to moisture, litter, and low vegetation and micro-topography. Also, a LAS file stores the intensity values as 8-bit or 16-bit digital numbers, and this discretization may introduce further precision issues. Additionally, most lidar devices captureP as the intensity value, which is unsuitable for the IB methods. For example, for Riegl lidar devices, most of the past full-waveform studies used Q-line with full-waveform storage from which all the radiometric quantities can be derived. Our study supports the potential of V-line using online waveforms processed into apparent reflectance ρ a [38][39][40] with lower cost, smaller divergence, higher pulse repetition frequency, and more efficient data storage without waveforms. For the IB methods, the constant γ assumption in estimating LPI λ heavily relies on the statistically convergent leaf angle projection, clumping index, and natural reflectivity ratio of the leaves and the ground within a region. The understory vegetation and debris can influence the ground reflectivity and the fitting of γ. The full-waveform method used by Armston, et al. [16] and Chen, et al. [17] is a practical approach without empirical correlation. However, the method needs a different least-squares approach to consider the inconsistent variances over the dataset. The LPI nearest could be another practical approach, but possible sources of bias exist due to insufficient single-return ground pulses and the possible large variance of ground reflectance. Furthermore, the influences of the threshold of ρ a should be noted (Appendix B).
Based both on the modeling results in this study and on previous studies (e.g., [16,17,81]), estimation of LAI from small-footprint ALS point clouds without field measurements is promising. We investigated the dependence of the linear fitting coefficient α. From both homogeneous and heterogeneous simulations, we have shown that P gap can be estimated without the correlation coefficient α. For the heterogeneous scene, the linear coefficient α · ω −1 is analyzed, with a parsed expression of ω provided in Equation (11). eLAI is divergent from the actual LAI, for which precise estimation requires the inputs of vegetation cover fraction ( f vcc ), shape, and within-crown clumping index (ω in ) and G in Equation (11). Small-footprint ALS has a great advantage in capturing the structures of tree crowns. Although our simulation shows that f vcc can be estimated by LPI first based on the vertical pulse assumption, the actual lidar incident angle would introduce non-negligible biases. Considering the influences of both crown shape and within-crown clumping, ω in could be resolved with aggregating pulses within a small area to estimate the within-crown optical depth distribution instead of only the path geometrical length distribution [25]. Another difficult configuration is the footprint size, which should be larger than the within-crown gaps but smaller than the between-crown spaces, to capture: (1) the f vcc through first returns to exclude the within-crown gaps and (2) the light penetration within the crown through multi-return pulses to exclude the between-crown spaces. Fortunately, most of the recent ALS data are considered satisfactory (e.g., Cook, et al. [8]). Therefore, the LAI map product could be provided more accurately with a proper method. The ray-tracing-based voxel reconstruction (e.g., AMAPVox [81]) from ρ a could potentially be a solution in resolving f vcc and ω in together. The precise estimation of G and ω in might require a multi-ALS scan from different directions.
In this study, we took Riegl lidar configurations with extra byte storage as typical examples. However, all ALS data are diverse due to the sensor specifications and environmental configurations. The acquisition methods, configurations, and point-processing algorithms from different manufacturers are also diverse. The appropriate way to understand the capability of a device requires physical modeling instead of directly applying approaches that were previously verified with only a single experiment. There are several other important effects not included in this study, such as the angle distribution of incident pulses, the woody parts of realistic trees including twigs, as well as the sensor noise, dead-time, out-of-focus effect, etc. These effects will be considered in future work in which more complex approaches (e.g., ray tracing) are involved. In this work, we specifically focused on point representation through basic and idealized experimental modeling.

Conclusions and Outlook
In this work, we conducted an in-depth study of gap probability and LAI estimates from small-footprint ALS point clouds with sensitivity studies to evaluate the methods based on either point number (PNB) or intensity (IB), using improved DART point-cloud-modeling capabilities. The simulated scene configurations include both homogeneous vegetation plots and heterogeneous forest, with clumping effects existing at both the landscape and crown scale.
Apart from the clumping index with a clear physical definition, the PNB methods require additional empirical correlation coefficients that rely on field measurements. The correlation could become saturated, with diminished sensitivity, relying on the relationship between the ratio of footprint diameter to leaf dimension and the LAI. Among all the PNB methods, the LPI weighted approach [28,35,68] is preferable because it has the broadest effective range in adapting various configurations (footprint size, LAI, and clumping), and is the most physically based, which preserves the valid clumping index without the influence of correlation (Table 4). In contrast, all the IB methods listed in Table 2 have the great advantage over the PNB methods of being able to accurately estimate the gap probability without the requirement of a correlation coefficient [α ≈ 1 in Equations (13) and (14)], given that an appropriate radiometric quantity of lidar point is used (either the distance-weighted power integral I or the apparent reflectance ρ a of each return). The IB methods remain accurate in the presence of both landscape and within-crown clumping. However, despite the empirical correlation coefficient, the clumping index ω that is required for true LAI estimation remains a challenge to estimate merely from lidar points. Further parsed analysis of clumping to figure out the separated contributions by landscape, crown shape, and within-crown variation requires new approaches to be developed, potentially by combining voxel-based ray tracing with well-defined radiometric quantity of ALS points.
For actual data processing, the choice of PNB methods or IB methods should be analyzed case by case depending on whether field measurements are available and whether the uncertainty or noise of using point radiometric quantity is less than those of using point number. The potential of using abundant ALS point-cloud resources could provide additional validations and pre-studies of existing and future spaceborne platforms. The point-cloud module has been implemented in the latest DART release, which is also capable of TLS simulation (versions higher than 5.7.0). The recent development of the DART lidar module supports the conversion of DART-simulated waveforms into point clouds, with storage of waveforms and point-cloud data in the standard LAS format with the extra byte output.
These advances bridge the gap between simulated data and current data-processing software, to improve the usage of lidar points for assessing the accuracy of ALS and TLS inversions. Our next work will be on evaluations in 3-D point voxelization software through DART ALS and TLS simulations: e.g., AMAPVox [81], VoxelSD [28], and VoxLAD [29,82]. DART provides free licenses for scientific and educational work.

Conflicts of Interest:
The authors declare no conflict of interest.
Receiver telescope area A s Footprint area at a certain distance A proj Projection area C cal A calibration constant with pre-defined instrumental and experimental·configurations d l The beam cross-section diameter at the "exit gate" of the laser generator D r Diameter of the receiver telescope eLAI Effective LAI f vcc Proportion of crown vertical projection area by neglecting the within-crown gaps g(Ω f ) Leaf angle distribution function G Unit leaf area projection along the pulse direction I Integral of a return Gaussian power profile, total power of a return I The distance-weighted power integral (I · R 2 ), proportional to ρ a I The temporal standard deviation of a returned Gaussian profile of return index i s i The temporal standard deviation broadening of a return Gaussian profile (index i) from target interaction s β The standard deviation of the angular energy distribution within footprint T(Ω i ) Back-scatter transfer function of Ω i of return index i α Correlation coefficient to link LPI with P gap (exponential) and eLAI est with eLAI (linear) β The angular offset from the pulse direction β t Footprint divergence γ Ratio of ground apparent reflectance over vegetation apparent reflectance σ Radar back-scattering differential cross-section ρ Natural reflectivity of a target ρ a Apparent reflectance of a target Ω i , ∆Ω i Direction and solid angle from target to the receiver of return index i Ω f Leaf normal direction η System and atmospheric transmission factor ω The overall clumping index for LAI estimation over an area ω in Index of clumping that is induced by only crown shape and within-crown clumping θ g The angle between ground normal and the incident direction of a pulse Figure A1 shows the workflow for simulating lidar point clouds. For the first step, DART reads a 3-D scene (i.e., an abstract description or field of 3-D objects) with user-defined experimental configurations for lidar acquisition (e.g., altitude, platform trajectory, pulse divergence, scan density, control points, etc.). The stages enclosed in the dashed box are DART internal processes. DART uses a quasi-Monte Carlo ray-tracing approach to model the waveform for each pulse [48]. DART-simulated waveforms can either be exported as a binary file, or processed as points with GD before being exported as a text file. Remote Sens. 2019, 11, x FOR PEER REVIEW 25 of 32 Figure A1. Workflow of DART outputs: discrete points in text format or discrete points with associated waveforms in LAS format.

Output exportation to LAS:
The DART-simulated waveforms are stored as a binary file. In the next step, this binary file can be converted into standard lidar data formats to adapt to the existing software. Previous work documented the output conversion into sorted pulse data (SPD) format [58,83]. Here, we implement an approach that processes the binary file into an LAS 1.3 point cloud through the LASPY library Figure A1. Workflow of DART outputs: discrete points in text format or discrete points with associated waveforms in LAS format.

yOutput exportation to LAS:
The DART-simulated waveforms are stored as a binary file. In the next step, this binary file can be converted into standard lidar data formats to adapt to the existing software. Previous work documented the output conversion into sorted pulse data (SPD) format [58,83]. Here, we implement an approach that processes the binary file into an LAS 1.3 point cloud through the LASPY library [84]. The points are stored in an LAS file, with the possibility to export the waveforms to another WDP file that is linked to each decomposed point [Header description in [85], point type 1 or 4]. The file is supported by existing LAS visualization and processing software (e.g., CloudCompare [86]).

Internal waveform processing:
Waveforms are decomposed into points in DART code that are saved as matrices in a text file. Each line represents a point. Columns store 3-D position and "intensity" information, including peak amplitude (P), temporal standard deviation (s p,i ), integral (∝P · s p,i ), apparent reflectance (ρ a ), return index, number of returns per pulse, etc.
The dual-output option provides great flexibility in different applications. For example, Option 1 supports the validation of algorithms that work with ALS waveform data in LAS format (e.g., Riegl's Q-line devices). Option 2 may be preferable for numerous points (e.g., Riegl's V-line devices with online waveform processing). In that case, storing waveforms in a binary file would require tremendous unnecessary computer memory and hard disk space. Additionally, users may want to investigate the sensitivity of various input instrumental variables (d l , β t , . . . ) or output intensity selections (P, σ, or P · σ . . . ) that can be stored in the extra bytes of LAS format for different applications (e.g., classification, leaf angle distribution inversion, etc.).

Appendix B Influence of Peak Detection Threshold
For actual devices, the parameters that determine whether a return can be detected by a lidar device are R and ρ a , due to the sensor capability. In this section, the points produced by GD are filtered with threshold ρ a > 0.009, which is a typical value retrieved from the point cloud generated by Riegl VQ-480i. Figure A2 illustrates the LPI calculated by the filtered point cloud under different LAIs for the homogeneous scene (LAI = 1, 3, and 6). Compared to the unfiltered LPIs illustrated in Figure 3, the filtered LPIs have a larger variance. When the LAI is small, the vegetation return usually has lower ρ a , which may be filtered out. This causes an overestimation by LPI γ=2.10 , as some energy from vegetation backscattering has been removed. LPI fitted is also influenced by the large divergence of dataset variance. The bias curve gives a parallel displacement. LPI nearest still provides accurate estimations due to the idealized ground reflectance and the adaptation with ρ a filtering. For the PNB methods, the estimated LPIs are slightly lower than the ones from the unfiltered point cloud, because the average number of returns has been reduced ( Figure A3). From Figure A3, the average number of ground returns decreases with increasing LAI, which means there are more pure-vegetation pulses. For the heterogeneous scene, the results are shown in Figure A4 and Table A1. These results correspond to a potential reason for the occlusion effect in ALS applications [87]. , which may be filtered out. This causes an overestimation by LPI . , as some energy from vegetation backscattering has been removed. LPI is also influenced by the large divergence of dataset variance. The bias curve gives a parallel displacement. LPI still provides accurate estimations due to the idealized ground reflectance and the adaptation with filtering. For the PNB methods, the estimated LPIs are slightly lower than the ones from the unfiltered point cloud, because the average number of returns has been reduced ( Figure A3). From Figure A3, the average number of ground returns decreases with increasing LAI, which means there are more pure-vegetation pulses. For the heterogeneous scene, the results are shown in Figure A4 and Table A1. These results correspond to a potential reason for the occlusion effect in ALS applications [87].     Figure A4. Reference LAI plotted against ln LPI −1 for the heterogeneous scene after filtering (LAI = 1 and 2) under footprint size 0.3 m: (a,b) Correlation between reference LPI and lidar-derived LPI; (c,d) Correlation between reference LAI and eLAI est from Equation (12).