Estimation of Forest LAI Using Discrete Airborne LiDAR: A Review

: The leaf area index (LAI) is an essential input parameter for quantitatively studying the energy and mass balance in soil-vegetation-atmosphere transfer systems. As an active remote sensing technology, light detection and ranging (LiDAR) provides a new method to describe forest canopy LAI. This paper reviewed the primary LAI retrieval methods using point cloud data (PCD) obtained by discrete airborne LiDAR scanner (DALS), its validation scheme, and its limitations. There are two types of LAI retrieval methods based on DALS PCD, i.e., the empirical regression and the gap fraction (GF) model. In the empirical model, tree height-related variables, LiDAR penetration indexes (LPIs), and canopy cover are the most widely used proxy variables. The height-related proxies are used most frequently; however, the LPIs proved the most efﬁcient proxy. The GF model based on the Beer-Lambert law has been proven useful to estimate LAI; however, the suitability of LPIs is site-, tree species-, and LiDAR system-dependent. In the local validation in previous studies, poor scalability of both empirical and GF models in time, space, and across different DALS systems was observed, which means that ﬁeld measurements are still needed to calibrate both types of models. The method to correct the impact from the clumping effect and woody material using DALS PCD and the saturation effect for both empirical and GF models still needs further exploration. Of most importance, further work is desired to emphasize assessing the transferability of published methods to new geographic contexts, different DALS sensors, and survey characteristics, based on ﬁguring out the inﬂuence of each factor on the LAI retrieval process using DALS PCD. In addition, from a methodological perspective, taking advantage of DALS PCD in characterizing the 3D structure of the canopy, making full use of the ability of machine learning methods in the fusion of multisource data, developing a spatiotemporal scalable model of canopy structure parameters including LAI, and using multisource and heterogeneous data are promising areas of research.


Introduction
The leaf area index (LAI) is a critical indicator of vegetation status. It is widely used in many fields, such as earth science, agronomy, ecology, and forestry [1], as an essential input variable to the land surface process model [2,3]. LAI is a dimensionless variable (or m 2 /m 2 ), and its concept has progressed through a long process of definition and clarification [4][5][6][7]. In quantitative remote sensing, the most popular and widely accepted definition of LAI refers to half of the total leaf area per unit ground area [6].
Direct or indirect measurements on the ground and remote sensing inversion from remotely sensed data are the primary methods used to determine LAI. Direct measure-Remote Sens. 2021, 13, 2408 2 of 17 ment methods consist of destructive sampling, litterfall collection [8], and point quadrat sampling [9][10][11]. The indirect methods employ non-contact optical equipment to measure the canopy gap fraction (GF), i.e., the physical GF [12], e.g., by LAI-2200C, or geometric GF [13], e.g., by digital hemisphere photo (DHP), and estimate the canopy LAI using the Beer-Lambert law. More details of these indirect methods can be found in the recent review by Yan [14]. However, both ground-based direct and indirect measurements are laborious, even with the new generation Internet of Things-based equipment, e.g., LAINet [15], and may be infeasible due to spatial accessibility.
Remote sensing is the only feasible method to retrieve LAI on a large or even global scale [16]. Several global LAI products have been made publicly available in the last decade based on passive remote imagery, e.g., MODIS-LAI [17], GLOBCARBON-LAI [18], and CYCLOPES-LAI [19]. However, such a passive remote sensing-based LAI may suffer from spectral saturation and poor atmospheric conditions [20,21]. Moreover, it can only capture horizontally distributed features of the canopy in a fixed spatial resolution depending on the specific sensor [22]. In addition to passive remote, a type of active remote sensing sensor, light detection and ranging (LiDAR) has been increasingly used for LAI estimation [20,[23][24][25]. Because of its ability to penetrate the canopy, LiDAR has been proven to mitigate signal saturation to some extent [26,27]. In addition, LiDAR can simultaneously characterize the horizontal and vertical structure of canopy [28,29] at multiple spatial scales, such as individual trees, sample plots, or areas [30]. Furthermore, as a technique that is essentially different from passive remote sensing, LiDAR-derived LAI provides a new independent source for assessing the LAI retrieval model based on passive remote sensing [31][32][33][34].
Different LiDAR systems, e.g., full-waveform [35] and discrete LiDAR systems [20], have been employed for LAI retrieval. The proposed methods for different types of LiDAR systems have been comprehensively reviewed recently by Wang from a more frameworkoriented view [36]. However, to our knowledge, there is still a lack of a more specific review aiming at the LAI retrieval methods for discrete airborne LiDAR scanners (DALS) which is one of the most popular types of LiDAR systems in forest resource surveys [37][38][39]. Since the commercialization of DALS LiDAR systems at the beginning of this century, point cloud data (PCD) obtained by DALS has been widely used to estimate LAI [20,23,25,27,[39][40][41][42][43][44]. However, published literature has shown that the accuracy of LAI inversion based on DALS PCD (typically, RMSE > 0.5 [25]) can barely meet the requirements of the global climate change research community (GCOS) who demands the inversion accuracy of LAI to be 0.5 LAI units [45]. The gap between the accuracy of LiDAR LAI products and application requirements requires an in-depth understanding of LAI estimation methods based on DALS PCD. We believe that this topic must be investigated to guide selecting an optimal method from various candidate methods when conducting LAI estimation using DALS PCD. In this context, this paper expects to provide a review of the LAI retrieval methods using DALS PCD, the validation scheme and limitations of those methods, and future development trends in this field.

An Overview of the DALS System
The basic principle of different LiDAR systems is similar: LiDAR scanner emits a pulse and measures the distance between the sensor and ground object. The distance, subsequently, is converted into the geographic coordinates of the object, combining with GPS and IMU (Inertial Measurement Unit) mounted onto the platform together with the LiDAR scanner. Current LiDAR systems can be categorized as spaceborne laser scanning (SLS), airborne laser scanning (ALS), terrestrial laser scanning (TLS), mobile (mobile laser scanning, MLS) and unmanned aerial vehicle (UAV) laser scanning (ULS) based on platforms [46] or as discrete, full-waveform and photon-counting LiDAR based on their sampling frequencies [42]. More details of these types of LiDAR systems can be found in [46].
For discrete airborne laser scanners (DALS), the size of the footprint is small (typically < 1 m [46]). The LiDAR beam can penetrate the canopy through the gaps in the canopy, Remote Sens. 2021, 13, 2408 3 of 17 which is accompanied by the expanding of the footprint and the attenuation of the pulse energy as illustrated in Figure 1a. In Figure 1a, the width of the path represents the gradually expanding footprint, and the gradient fill of the path represents the attenuation of pulse with the propagation of beam in the canopy. Along with the propagation of pulse in the canopy, multi-return from objects in the path of the pulse can be recorded (typically < 7 for DALS [42]) until the pulse is entirely occluded by the canopy component (e.g., # 2A in Figure 1a) or finally reaches the ground (e.g., # 3C in Figure 1a) [47]. In addition to the geographic coordinates of each return, PCD in LAS format [48] also records information such as the number of returns (NRs), return number (RN), and intensity of each return, etc. opy, which is accompanied by the expanding of the footprint and the attenuation of the pulse energy as illustrated in Figure 1a. In Figure 1a, the width of the path represents the gradually expanding footprint, and the gradient fill of the path represents the attenuation of pulse with the propagation of beam in the canopy. Along with the propagation of pulse in the canopy, multi-return from objects in the path of the pulse can be recorded (typically < 7 for DALS [42]) until the pulse is entirely occluded by the canopy component (e.g., # 2A in Figure 1a) or finally reaches the ground (e.g., # 3C in Figure 1a) [47]. In addition to the geographic coordinates of each return, PCD in LAS format [48] also records information such as the number of returns (NRs), return number (RN), and intensity of each return, etc.
A set of preprocessing of DALS PCD is required before further analysis, including noisy point removal, point cloud filtering, height standardization, and classification (Figure 1b). These 3D points in DALS PCD can be classified according to NRs and RN relationships (Table 1). Combined with the land coverage type represented by the point, e.g., the ground or vegetation, these points can further be classified into single ground echo ( ), single vegetation echo ( ), first vegetation echo ( ), and last ground echo ( ), etc.

Physical Method
The Beer-Lambert (BL) law relates the attenuation of light to the material properties through which the light travels [49]. The gap fraction model derived from the BL law can be used to describe the relationship between LAI and the gap fraction (GF) of the canopy as follows [12]: where ( ) is the gap fraction at the view zenith angle , is the mean inclination angle of canopy component, and ( , ) is the Ross Nilson G-function, which refers to A set of preprocessing of DALS PCD is required before further analysis, including noisy point removal, point cloud filtering, height standardization, and classification ( Figure 1b). These 3D points in DALS PCD can be classified according to NRs and RN relationships ( Table 1). Combined with the land coverage type represented by the point, e.g., the ground or vegetation, these points can further be classified into single ground echo (SR g ), single vegetation echo (SR v ), first vegetation echo (FR v ), and last ground echo (LR g ), etc.

Physical Method
The Beer-Lambert (BL) law relates the attenuation of light to the material properties through which the light travels [49]. The gap fraction model derived from the BL law can be used to describe the relationship between LAI and the gap fraction (GF) of the canopy as follows [12]: where GF(θ) is the gap fraction at the view zenith angle θ, α is the mean inclination angle of canopy component, and G(θ, α) is the Ross Nilson G-function, which refers to the fraction of a unit plant area projected onto a plane perpendicular to a direction making an angle θ with nadir [50]. From Equation (1), the LAI can be expressed as: Remote Sens. 2021, 13, 2408 4 of 17 GF(θ) and G(θ, α) in Equation (2) can both be estimated by DALS PCD. The GF(θ) generally characterized by the LiDAR penetration indexes (LPIs) derived from DALS PCD. Existing LPIs can be summarized into three types, i.e., point-number-based LPIs (PNB LPIs), point-intensity-based LPIs (IB LPIs), and voxel-based LPIs (VB LPIs) (see Section 3.1.1). Three types of methods have been proposed to estimate the G(θ, α) of canopy (see Section 3.1.2). The process of estimating LAI basing on the GF model is straightforward when LPIs and G(θ, α) are available ( Figure 2). Remote Sens. 2021, 13, x FOR PEER REVIEW 4 of 17 the fraction of a unit plant area projected onto a plane perpendicular to a direction making an angle with nadir [50]. From Equation (1), the LAI can be expressed as: ( ) and ( , ) in Equation (2)   (2) Point-intensity-based LPIs (IB LPIs), where IB LPIs is calculated as the ratio of the intensity of ground points to the total intensity; and (3) Voxel-based LPIs (VB LPIs), which divides the space occupied by PCD into 3D voxels cell by subjectively selecting the sizes of ∆ , ∆ , and ∆ , and the corresponding LPIs is calculated as the ratio of empty voxels that do not contain point clouds to the total voxels [23]. An example of such three types of LPIs is presented in Figure 2. There are more than ten forms of LPIs, and some widely used are listed in Table 2.  (2) Point-intensity-based LPIs (IB LPIs), where IB LPIs is calculated as the ratio of the intensity of ground points to the total intensity; and (3) Voxel-based LPIs (VB LPIs), which divides the space occupied by PCD into 3D voxels cell by subjectively selecting the sizes of ∆x, ∆y, and ∆z, and the corresponding LPIs is calculated as the ratio of empty voxels that do not contain point clouds to the total voxels [23]. An example of such three types of LPIs is presented in Figure 2. There are more than ten forms of LPIs, and some widely used are listed in Table 2.   [52] 1 NS v is the number of returns of pulse in where a ground return was recorded. I * refers to the intensity of certain types of returns (e.g., I SRg is the intensity of single ground return) and I total is the total intensity of ground and vegetation returns. The other variables are the same as those defined in Table 1 and Section 2.

LPIs Form Reference LPIs Form Reference
Previous studies have shown that different LPIs are highly correlated but may exhibit a significant absolute difference from each other [21]. However, the proper form of the LPIs metric is currently unclear, and researchers typically subjectively choose an LPIs form for 1.
Most studies suggested that PNB LPIs are superior to IB LPIs for DALS PCD [33,51]. The use of IB LPIs remains a controversial issue due to the unclear physical meaning of intensity and difficulty in performing intensity correction [1,21,42,53,54].

2.
For PNB LPIs, the LPI SR , derived from the single return, typically underestimates GF due to the lack of sensitivity of a single echo to the small gap within the crown [41,47,55,56], resulting in 20-30% overestimation of LAI [51]. Conversely, LPI SR_LR , which is calculated by single and last return, typically overestimates GF, thus underestimates LAI [33,55]. Weighted LPIs (e.g., LPI SCI , LPI MCI ) are typically plagued with the suitability of the weights for different return types [20,57]. Voxel-based LPIs, however, may be biased due to a wrong cell size selected subjectively [23,36,58].

3.
For IB LPIs, the reflectance of background and vegetation is an important factor that affects the energy from either the background or vegetation [40,59]. Therefore, it is necessary to calibrate IB LPIs for better performance [41,47,60] by introducing the backscattering coefficient of background and vegetation, or at least their ratio (µ).
Given the fact that spectral property of background and vegetation not only affects the reflected energy but also the number of returns from either the background or vegetation and thus in further PNB LPIs, a similar to IB LPIs calibration method was proposed aiming at PNB LPIs in [25]: where LPI adj. is the adjusted PNB LPIs. The µ can be derived from ground measured spectral data [61], remotely sensed hyperspectral data [25], or the intensity of returns [42,62]. However, the estimation of µ still needs further study (see Section 5.2).

Estimating G from DALS PCD
The G function in Equation (2) is a function of the observation zenith angle and leaf orientation [63,64]. For most tree species, the leaf is nearly uniformly distributed in the azimuth direction [65]; therefore, leaf orientation typically refers to the leaf angular distribution (LAD) in the zenith direction [66]. The G function can be expressed in terms of LAD and observation zenith angle as: where A(α, θ) is the Reeve kernel, i.e., the average projection of unit area of the leaf, of angle α and uniform azimuth, in the direction of the probe θ [9,67]. g(α) is the LAD of the canopy. It was almost impossible to directly measure the LAD of the forest canopy with the conventional methods [12,68] due to the forest canopy being tall and intricate [12]. To quantitatively describe the LAD of a canopy, many theoretical models have been proposed, such as the spherical [69], ellipsoidal [64], and beta function models [70]. The G of the canopy that satisfies the spherical model is independent of the observation direction and always equals 0.5 [47,69,71]. Studies on these theoretical LAD models also found that G is approximately equal to 0.5 when the observation zenith angle is approximately 57.3 • , regardless of the LAD [67,72].
Another way to obtain the G function is based on multi-directional GF [12,73]. However, only a few observation angles are available in a small area (e.g., plot), particularly when the platform height was increased to save data acquisition costs [23]. Considering that LAD is relatively stable in a given area compared to the variation in LAI [40], Qu et al. hypothesized that LAD in a larger area that is above the plot, called "tile," can be described by an ellipsoidal model [64] with parameters that were estimated using the nonlinear optimization method [25].
Remote Sens. 2021, 13, 2408 6 of 17 LAD has been successfully extracted from PCD acquired by terrestrial laser scanning (TLS) directly. TLS typically has a smaller footprint (<5 cm) and footprint space interval (<5 cm) [46] and hence a higher point density compared with DALS. Under the assumption that the points in PCD can characterize the leaf's surface, a certain number of nearby points called the "support region" were reconstructed as "leaves." Consequently, the angle between the normal vector of "leaves" and zenith direction was calculated as leaf inclination angle [74,75]. Zheng et al. tried to apply this method to DALS PCD (point density > 20 pts/m 2 ) [23]. Ma et al. also applied the method to DALS PCD to obtain the leaf orientation distribution of the inclinational and azimuthal angles simultaneously [71]. The "occlusion effect," i.e., the upper canopy element, blocks some of the LiDAR pulses completely ( Figure 1) and prevents it from sampling behind upper elements [41,76], but would less limit the application of this method because measuring some samples is enough to represent the LAD of the whole canopy [14]. While the pulse footprint size (>10 cm) and the footprint space interval (>20 cm) [46] of the DALS system is generally larger than the leaf characteristic size, a study has proved that the canopy extinction coefficient (i.e., in Equation (1)) extracted from DALS PCD can also effectively characterize the canopy in some specific view zenith angles [71].
In summary, there are four types of methods to determine the G of a canopy, i.e., the LAD-and observation-direction-independent method (a spherical model), LAD-independent hinge zenith angle (57.3 • ), multi-directional GF-based method, and "leaves" reconstruction method based on LiDAR PCD. However, studies have shown that LAD is strongly dependent on tree species, age [20], and even time [63,77], which implies that a spherical model may not suitable in some cases. Existing LiDAR systems designed to obtain an accurate digital elevation model work on a small scan angle (typically < 30 • ), far away from the hinge observation angle. Moreover, the details of the multi-directional GF-based method still require further verification, such as the size of "tile" and prior knowledge about the constraints on model parameters, and its identical LAD assumption may not be suitable for mixed forest sites. The "leaves" reconstruction method seems a promising one, especially with the DALS increasingly equipped with a higher pulse repetition frequency which theoretically enhances the ability of DALS to sample canopy elements in more detail [78]. Apart from the LAD-independent hinge zenith angle method, the other three methods can be employed for G estimation using DALS PCD (Figure 2).

Empirical Regression Based on Proxy Variables
Empirical models are based on the allometric growth equation model, which assumes that the target canopy structure parameters (e.g., LAI) are related to other canopy structure parameters (e.g., tree height, crown width, etc.) [79]. The proxies assessed in previous studies are mainly constructed based on some heuristic clues and can be categorized into three types, i.e., height-related variables, cover-related, and LPIs (Table 3). A high correlation was found between LPIs [44], cover related metrics [60], and LAI; however, these models constructed based on the number/intensity of different types of returns are difficult to expand between different LiDAR systems even in the same region [21]. The height-related proxies were used frequently; however, a relatively lower correlation was found between such a proxy and LAI [60,80,81] compared with LPIs. The empirical method was also highly sensitive to tree age, time, stand density, the accuracy of field measurement data of LAI, etc. [53,55,57,82]. Overall, all regression models are calibrated and validated against local field data; however, reliable models that can be transferred to other locations, times, and different LiDAR systems are unfortunately lacking [21,53]. Table 3. Widely used proxies for the empirical regression model.

The Validation of DALS PCD-Based LAI
Fang summarized six types of schemes for remote sensing product/algorithm validation [2]. These schemes also outline the process of DALS PCD-based LAI validation, such as field-to-remote sensing direct comparison, validation with model-simulated LAI, and the inter-comparison of multiple LAI products, etc.
The field-to-DALS LAI direct comparison is involuntary, as the DALS can characterize the canopy at a plot scale corresponding with that of ground direct/indirect measurement. The indirect measurement of LAI using LAI-2200 and DHP is the most used data source for DALS PCD-based LAI validation, and a strong correlation was reported [20,77]. There were three problems when such an indirect field measurement was used to validate the PCD ALS-LAI. Firstly, the conical field of view of field measure devices (CFOV) mismatch with the pre-defined field plot size used to extract the PCD subset of field plot (SPCD). The CFOV is generally larger than the pre-defined field plot size, and hence a SPCD that is comparable to the tree height within the plot or a larger one than pre-set plot size is recommended to construct the LAI retrieval model according to the results of previous studies [20,33,39]; Secondly, the inherent difference between DALS and ground measurement devices in characterizing the canopy results in different data representations of the same forest canopies [57,87]. The DALS capture more foliage elements in the upper canopy, while more stems and lower canopy components will be sampled by field indirect measurement devices [43], even compared with the TLS that follows the identical physical basis but different observation configuration [71]. Finally, the quality of the field measurement would distort the process of validation. For example, Zhao tested the uncertainties of field LAI measurement using DHP caused by the analysts or algorithms and found such uncertainties were in the same order of magnitude as or even larger than that of the fitted regression models [33].
On the contrary, radiative transfer models (RTMs), such as FLIGHT [88] and DART [89], etc., are capable of simulating LiDAR data with a controlled canopy parameter, which can exclude the influences from field reference data. Based on the simulated DALS PCD using the DART model, Yin assessed the LAI retrieval model constructed by different LPIs and pointed out that the models based on IB LPIs showed higher accuracy and reliability than PNB LPIs [53]. However, the effectiveness of such a RTMs-based validation is restricted by the fineness of canopy reconstruction [87] and the parameters of the DALS system that is not always available for commercial DALS sensors [90].
The inter-comparison method assesses the temporal and spatial consistency of remote sensing products, which is an important aspect of validation [2,36]. The inter-comparison between the DALS PCD-based LAI and the satellite LAI products, e.g., GLOBCARBON LAI [33], and MODIS LAI [31], suggests that the DALS PCD-based LAI can serve as a reliable reference for validating moderate-resolution LAI products. The misregistration between the DALS PCD and the satellite could severely impact the pixel-wise intercomparison. The consistency between DALS-LAI and GLOBCARBON-LAI was signifi-cantly improved with a correlation coefficient from 0.08 to 0.85 for pine when the misregistration was corrected [33].
As summarized by Wang, LiDAR-based LAI validation was performed at a limited local scale [36], partly due to the poor transferability of the proposed model (see Section 5.1) and the high acquisition cost of DALS PCD. More comprehensive validation is required, namely, but not limited to, across different forests [21] to enhance understanding of and further improve the LAI retrieval models based on DALS PCD.

Model Scalability
The greatest problem with using DALS PCD to estimate LAI is the poor scalability of the model in time, space, and across different DALS systems. Currently, both empirical regression and the physical GF model rely on field data to construct or calibrate model fraught with inaccuracy [43], which reveals that DALS PCD-based LAI retrieval is unreliable for application without calibration by field measurement.
Such a poor transferability of the model has been widely recognized and was ascribed to the LiDAR system (e.g., pulse repetition frequency (PRF), pulse energy, beam divergence, etc.), observation geometry (e.g., platform height, scanning angle, etc.), properties of the ground object (e.g., spatial distribution, moisture content, spectral characteristics, etc.) and hypsography (e.g., slope, relative slope with the incident pulse, etc.). These factors jointly determine the PCD characteristics, such as the size of the footprint, point density, spatial distribution of point and data volume, etc., and also determine the applicable situations to which DALS PCD can be used effectively [46]. Efforts have been paid on the influences from part of these factors on LAI retrieval based on DALS PCD, such as the characteristics of the LiDAR system, namely the scan angle [91], point density [92], sensor altitude [93], etc. Nevertheless, few studies have investigated the properties of the canopy (e.g., spectral characteristics, spatial heterogeneity) and the relationships between various factors, which is necessary for the effective acquisition, autoprocessing, and accurate application of ALS data.
Overall, the disparity reported by the current literature regarding the estimation of LAI from DALS PCD highlights the necessity to figure out the influence of the aforementioned factors. On this basis, more robust models which can be transferred from one case to other are required to fill the gap between the experiment and practical application [21].

Calibration of LPIs with a Target Spectral Property
Equation (3) shows that the spectral properties of background and vegetation are important factors that can lead to the deviation of LPIs from GF. The discrepancy between LPIs and GF caused by spectral properties will increase rapidly when the spectral difference of the background and vegetation increases, particularly with medium canopy cover ( Figure 3). With µ = 5 obtained by Solberg [41], the overestimation of LPIs to GF can reach 79.8% (Equation (3)) if the spectral difference is ignored (i.e., µ = 1).
Three methods can be used to determine µ: (1) calculating µ using ground-based spectral data [47,61] or remotely sensed hyperspectral data [25]; (2) determining µ using a series of trials in a given range (e.g., µ = 1 to 10) and selecting µ that produces the highest coefficient of determination R 2 in a linear regression between retrieval LAI and groundbased measured LAI [41]; and (3) calculating µ using return intensities [42,62,94]. The first two methods require additional ground-based measurements in addition to LiDAR data and thus require more effort to collect auxiliary data, i.e., spectral or LAI data, that are consistent with the LiDAR data in both time and space. If no field spectral data are available, field spectral data collected in the historical database can be used as a reference [61,95]. However, the term "ground" in LiDAR data does not mean "soil" because the classifying threshold of LiDAR to differentiate vegetation and ground is a standardized height [94], which indicates that all points below the given threshold are classified as a ground [96]. This discrepancy partly explains the relatively low correlation of the LiDAR intensity with the field-measured spectrum [97]. It indicates that there is significant uncertainty in using the µ calculated from the field spectral or hyperspectral data for the LPIs correction directly. The third method using returns' intensity from vegetation and ground to calibrate LPIs, cannot accurately separate spectrally independent quantities from return intensities [98,99]. One reason for these difficulties is that the LiDAR manufacturers do not release all LiDAR system parameters, e.g., the intensity quantification algorithm, to end-users due to confidential considerations [21,100]. The missing DALS system parameters limit the radiation correction of point cloud intensity [1,54,101,102], indicating that the methods extracting µ directly from the intensity of returns must be validated. Three methods can be used to determine μ: (1) calculating μ using ground-based spectral data [47,61] or remotely sensed hyperspectral data [25]; (2) determining μ using a series of trials in a given range (e.g., μ = 1 to 10) and selecting μ that produces the highest coefficient of determination R 2 in a linear regression between retrieval LAI and groundbased measured LAI [41]; and (3) calculating μ using return intensities [42,62,94]. The first two methods require additional ground-based measurements in addition to LiDAR data and thus require more effort to collect auxiliary data, i.e., spectral or LAI data, that are consistent with the LiDAR data in both time and space. If no field spectral data are available, field spectral data collected in the historical database can be used as a reference [61,95]. However, the term "ground" in LiDAR data does not mean "soil" because the classifying threshold of LiDAR to differentiate vegetation and ground is a standardized height [94], which indicates that all points below the given threshold are classified as a ground [96]. This discrepancy partly explains the relatively low correlation of the LiDAR intensity with the field-measured spectrum [97]. It indicates that there is significant uncertainty in using the μ calculated from the field spectral or hyperspectral data for the LPIs correction directly. The third method using returns' intensity from vegetation and ground to calibrate LPIs, cannot accurately separate spectrally independent quantities from return intensities [98,99]. One reason for these difficulties is that the LiDAR manufacturers do not release all LiDAR system parameters, e.g., the intensity quantification algorithm, to end-users due to confidential considerations [21,100]. The missing DALS system parameters limit the radiation correction of point cloud intensity [1,54,101,102], indicating that the methods extracting μ directly from the intensity of returns must be validated.

Correcting Impact from the Clumping Effect and Woody Material
Clumping and woody materials are the most important factors estimating true LAI in indirect ground measurements and remote sensing [103,104]. The former may lead to a 30-70% underestimation of LAI [105], and the latter may lead to a 7-40% overestimation of LAI [106].
Clumping can simultaneously result from within-and between-crown and the shape of the crown. Many efforts have been exerted to correct the clumping effect. Proposed methods include, as systematical reviewed by Yan [66], the finite-length averaging method (LX) [107], gap-size distribution method (CC) [105], method combining the two methods above (CLX) [108], and path length distribution method (PATH) [109]. Some of these approaches have been successfully applied to DALS PCD data by transforming 3D data into 2D imagery data, considering the low point density due to the occlusion effect and low sampling density. Mariano had assessed the performance of CC on DALS PCD, and the results indicate that the clumping index of CC showed good agreement with that from DHP (R 2 = 0.71, RMSE = 0.09) [110]. The PATH model has been used for both DALS PCD and TLS PCD, which inherently contains the path length and distribution that the PATH model requires [32,111]. In addition, given that the LiDAR pulse can reasonably be regarded as a systematic sampling of the canopy [32], some statistical methods have also been introduced to estimate clumping index, such as standardized Morisita's index (SMI) [112] and Pielou coefficient of segregation (PCS) [113], indexes widely used in ecological studies to describe the distribution of species. By voxelating the PCD from TLS and DALS and classifying the voxel into different species (i.e., empty or vegetation voxel), Mariano investigated the SMI and PCS on estimating clumping index, and the results showed that the performance of both indexes depends on the size of the voxel [110].
The woody component is another important factor that causes retrieval LAI based on the GF model to deviate from the true LAI. From the spectral perspective, although the intensity of LiDAR returns can be used to identify vegetation types [114,115], it is not easy to distinguish different canopy components using a single band. From the spatial side, the advantage of LiDAR in describes the geometrical characters of the canopy component would be a promising direction to classify out the woody component, however, which is restricted by the size of the footprint and PRF of the existing ALS system [71]. Nevertheless, there is still hope: (1) the multiband LiDAR system was invented long ago [116] and has been commercialized in recent years [117,118], making it possible to identify woody components [77]; (2) with the development of the ALS system and the combination of TLS and ALS [36], the footprint is getting smaller, and the PRF is getting higher increasingly [119]. Therefore, the canopy component identification method developed from TLS PCD [120,121] will be reasonably applicable to ALS PCD to eliminate contribution from woody elements.

Saturation Effect
Although a LiDAR pulse can penetrate a forest canopy to some extent, ALS PCD-based LAI may saturate in some cases. For GF model-based method, the saturation refers to the cases that the pulse cannot penetrate the complete canopy, i.e., a lack of ground return, resulting in a gap rate of 0, which leads to failure of the LAI determination based on the gap fraction model (Equation (2)) [21,51]. For the empirical regression model, the saturation is similar to that of passive remote sensing: the retrieval LAI does not increase with proxies, i.e., vegetation indexes for passive remote sensing or LiDAR metric for LiDAR, when the proxies exceed a certain value [33].
As far as we know, there is no study aiming at the saturation effect of LAI retrieval based on DALS PCD due to the well-known fact that the LiDAR beam does have penetration ability. By summarizing published literature, some of the critical LAI of saturation are listed in Table 4. In passive remote sensing, many researchers have observed that optically-derived vegetation indexes tend to saturation when LAI values are greater than 3 [24,122]. In contrast, LiDAR-based LAI retrieval does have, as expected, a larger critical saturation value of LAI (mean = 4.2 from Table 4); however, it seems condition-dependent, including upon tree species, model type, and LiDAR system used, etc. Interestingly, the point density (or pulse density) does not appear to be, as expected, a critical factor to the saturation effect. Unfortunately, it is hard to draw definite conclusions based on the limited samples we have.
For the GF model-based method, studies have proposed several solutions for saturation, including the method of increasing pulse density using a larger PRF, the method of substituting the infinitesimal nonzero value for zero-gap, the method of increasing the spatial extent to gain a certain amount of ground return, and the method of configuring the LIDAR instrument to be more sensitive to returns with low energy [77]. However, each method has advantages and disadvantages. For example, increasing the spatial extent will cause a lower spatial resolution of the results [56] and difficulty in model verification. Furthermore, higher sensor sensitivity is bound to increase data noise, which complicates data processing.
For the empirical regression model, it seems that a combination of the metrics constructed by geometrical-and intensity-related variable can delay the saturation effect to some extent [60], which can partly ascribe to the increased information from different sources. However, the contribution from more metrics may be limited; a study by Tesfamichael showed that the model constructed by thirteen metrics had ignorable improvement compared with the three-metrics model [123]. Height, cover related metric Empirical~3.5 0.81 11.3 Figure 13 in [85] 1 A single value refers to the radius of a circular field plot, the double values represent the side length of the rectangle plot, and h is the height of trees within a field plot.

Conclusions and Future Direction
As a critical parameter of a forest canopy structure, LAI can be quickly determined at multi-spatial scales using a discrete ALS LiDAR system widely used in forestry investigations. In this review, the methods and existing problems of LAI inversion based on DALS PCD are systematically summarized.
The empirical regression method and gap fraction models are the two dominant methods used to determine LAI using DALS PCD. Tree height and its derivative variables (e.g., percentiles of tree height), LPIs, and canopy cover are widely used proxy variables for empirical regression model construction. The gap fraction model estimates LAI using Beer-Lambert's law using LPIs derived from ALS PCD. The most significant problem with LAI retrieval using DALS PCD, for both empirical regression methods and gap fraction model, is the poor scalability of the model in time, space, and across different DALS systems. Further work is desired to emphasize assessing the transferability of published methods to new geographic contexts, sensor types, and survey characteristics, based on figuring out the influence of each factor on the LAI retrieval process using DALS PCD.
Unlike passive remote sensing that records an integrated signal from reflected energy, LiDAR provides a chance to characterize the canopy in high detail in a vertical direction [21]. Moreover, LiDAR also provides a new avenue for correcting the impacts of the clumping effect and woody material. For example, developing a multiband LiDAR system will make it possible to distinguish woody components in the canopy from a spectral perspective. Additionally, an increasingly smaller footprint and larger PRF will provide more effective data to construct canopy 3D structures accurately and determine the LAD needed for a reliable LAI retrieval [78].
From a methodological point of view, with the development of remote sensing technology, LAI estimation has entered a stage of integrating multisource data [29]. Compared to traditional methods, machine learning algorithms (e.g., neural networks) can fuse more auxiliary information [126], such as spectrum-related data from passive remote sensors and multisource historical data, and have been widely used in the inversion of canopy structure parameters such as LAI [127,128]. However, the training features set from traditional remote sensing methods are primarily composed of the spectrum, spectral index (e.g., NDVI), and 2D spatial structure information, which typically has strong collinearities [129]. However, LiDAR can provide 3D canopy structure information from a new perspective, theoretically providing a new data source for model training [130]. Therefore, using multisource, multiband, multiplatform, homogeneous, and heterogeneous data and taking advantage of machine learning algorithms to describe the inversion of canopy structure parameters should be investigated in future research [130].