Estimation of LAI with the LiDAR Technology: A Review

Leaf area index (LAI) is an important vegetation parameter. Active light detection and ranging (LiDAR) technology has been widely used to estimate vegetation LAI. In this study, LiDAR technology, LAI retrieval and validation methods, and impact factors are reviewed. First, the paper introduces types of LiDAR systems and LiDAR data preprocessing methods. After introducing the application of different LiDAR systems, LAI retrieval methods are described. Subsequently, the review discusses various LiDAR LAI validation schemes and limitations in LiDAR LAI validation. Finally, factors affecting LAI estimation are analyzed. The review presents that LAI is mainly estimated from LiDAR data by means of the correlation with the gap fraction and contact frequency, and also from the regression of forest biophysical parameters derived from LiDAR. Terrestrial laser scanning (TLS) can be used to effectively estimate the LAI and vertical foliage profile (VFP) within plots, but this method is affected by clumping, occlusion, voxel size, and woody material. Airborne laser scanning (ALS) covers relatively large areas in a spatially contiguous manner. However, the capability of describing the within-canopy structure is limited, and the accuracy of LAI estimation with ALS is affected by the height threshold and sampling size, and types of return. Spaceborne laser scanning (SLS) provides the global LAI and VFP, and the accuracy of estimation is affected by the footprint size and topography. The use of LiDAR instruments for the retrieval of the LAI and VFP has increased; however, current LiDAR LAI validation studies are mostly performed at local scales. Future research should explore new methods to invert LAI and VFP from LiDAR and enhance the quantitative analysis and large-scale validation of the parameters.


Introduction
Leaf area index (LAI) is defined as one half the total green leaf area per unit ground surface area [1]. It is listed as an essential climate variable by the global climate change research community (GCOS) and is a critical variable in processes such as photosynthesis, respiration, and interception [2,3]. The field LAI can be measured using direct sampling or indirect optical methods [4][5][6][7]. With a direct sampling method, the LAI can be directly obtained by harvesting vegetation leaves through the collection of leaf litter or destructive sampling [8]. With an indirect optical method, the LAI is estimated from the canopy gap fraction or transmittance using the Beer-Lambert law. The LAI values obtained from ground measurement are often used as references for remote sensing validation. However, these methods are labor-intensive and time-consuming, and the deployment over large areas is difficult.
The LAI estimations from remote sensing data show the most promise for accurate estimations in large scales. Existing techniques can be divided into two main categories, that is, passive optical remote For different platforms, LiDAR can be categorized into three groups: terrestrial laser scanning (TLS), airborne laser scanning (ALS), and spaceborne laser scanning (SLS). The main characteristics of LiDAR systems used for the estimation of the vegetation LAI are presented in Table 1.  In contrast, full-waveform LiDAR systems digitize the entire reflected energy, resulting in complete submeter vertical vegetation profiles. Within a forest environment, the full waveform indicates the forest structure (i.e., from the top through the crown and understory to the ground) ( Figure 1c) [16,26]. In contrast to discrete return and full-waveform systems, photon-counting LiDAR (PCL) systems are unique. They operate based on the concept that a low-power laser pulse is transmitted, and the utilized detectors are sensitive at the single-photon level. Based on this type of detector, any returned photon, whether from the reflected signal or solar background, can trigger an event within the detector [26,28]. An individual photon could be reflected from anywhere within the vertical canopy, the probability distribution function (PDF) of that reflected photon would be the full waveform (Figure 1d).
For different platforms, LiDAR can be categorized into three groups: terrestrial laser scanning (TLS), airborne laser scanning (ALS), and spaceborne laser scanning (SLS). The main characteristics of LiDAR systems used for the estimation of the vegetation LAI are presented in Table 1. Terrain impact, data gaps. [33][34][35] TLS is a ground-based LiDAR technology that acquires 3D details. Typically, these TLS systems emit laser pulses having footprints ranging from 1 cm to 5 cm. TLS systems can be classified into discrete return and full-waveform models by how they record the energy returning to the sensor (Table 1). Discrete return TLS measures the surrounding 3D space using millions to billions of 3D points, which are commonly presented in a point cloud [36]. Most discrete return TLSs record a single range for each laser shot. The TLS point cloud data provide the 3D distribution of canopy at individual tree or stand level and have been widely used for the estimation of the vegetation LAI [20,37]. The full waveform TLS records the light reflected from objects along the laser path, which can be calibrated to power units. Full waveform data can be used to improve the vegetation foliage extraction in forest stands with the fully digitized return pulse [38]. Such data have been used to measure the LAI, foliage profile, and stand height [30,39,40]. Based on the scanning geometry of TLS, near-range objects are more frequently sampled than far-range objects, which may limit its broad application. Figure 2 shows two examples of the TLS data: the point cloud data of a forest sample plot based on single-scan data colored by height (Figure 2a  ALS is deployed on fixed or rotary-wing aircraft, which is a data source for multiscale LAI estimation [42]. ALS covers relatively large areas in a spatially contiguous manner. The difference between TLS and ALS is that the ground laser pulse returns from TLS are limited due to the scanning geometry [43]. ALS systems can be classified as discrete and full-waveform. The ALS point cloud data are acquired at altitudes between 500 and 3000 m using a small laser pulse footprint ranging from 0.1 m to 3 m. The ALS point cloud data are the most common type of data used in forest inventory applications and several system developers and an increasing number of commercial vendors support the acquisition and analysis [44]. The ALS full-waveform data are typically acquired at higher altitudes (up to 20,000 m) using a large footprint ranging from 10 m to 30 m. A common ALS full-waveform data system is the airborne Land, Vegetation, and Ice Sensor (LVIS) [45]. Figure  3 shows the point cloud data of a forest plot (colored by height), with an average point density of 4.3 points/m 2 obtained by an ALS (upper), and the LVIS waveform data with a footprint size of 20 m (lower). ALS is deployed on fixed or rotary-wing aircraft, which is a data source for multiscale LAI estimation [42]. ALS covers relatively large areas in a spatially contiguous manner. The difference between TLS and ALS is that the ground laser pulse returns from TLS are limited due to the scanning geometry [43]. ALS systems can be classified as discrete and full-waveform. The ALS point cloud data are acquired at altitudes between 500 and 3000 m using a small laser pulse footprint ranging from 0.1 m to 3 m. The ALS point cloud data are the most common type of data used in forest inventory applications and several system developers and an increasing number of commercial vendors support the acquisition and analysis [44]. The ALS full-waveform data are typically acquired at higher altitudes (up to 20,000 m) using a large footprint ranging from 10 m to 30 m. A common ALS full-waveform data system is the airborne Land, Vegetation, and Ice Sensor (LVIS) [45]. Figure 3 shows the point cloud data of a forest plot (colored by height), with an average point density of 4.3 points/m 2 obtained by an ALS (upper), and the LVIS waveform data with a footprint size of 20 m (lower).  [46]. The image in the left panel is from Google Earth ® , and the right panel is a waveform return where the upper and lower peaks come from the forest canopy and the ground surface, respectively. SLS is deployed onboard satellites and can be classified into full-waveform and photo counting. Compared with the ALS and TLS, which have insufficient spatial coverage, the SLS is more suitable for forest surveys at global scales [47]. The Geoscience Laser Altimeter System (GLAS) is a spaceborne full-waveform LiDAR system onboard the Ice, Cloud, and land Elevation Satellite-1 (ICESat-1) [48,49]. The GLAS full-waveform data provide the first spaceborne laser altimetry data for Earth observations and have been applied for LAI estimations [33]. Figure 4 shows an example of the GLAS data with a footprint size of 70 m. The follow-on to the ICESat mission, ICESat-2, launched in September 2018, carries the Advanced Topographic Laser Altimeter System (ATLAS), a LiDAR system that utilizes the photon-counting technology [35]. The ICESat-2/ATLAS data have potential for the estimation of the forest canopy cover and LAI [50]. In addition, the Global Ecosystem Dynamics Investigation (GEDI) installed on the International Space Station (ISS) was launched in December 2018 [34]. The GEDI measures the forest canopy height and canopy vertical structure. GEDI provides the raw waveforms (Level 1B) and the LAI and vertical profile data (L2B) from the Land Processes Distributed Active Archive Center (LPDAAC) (https://lpdaac.usgs.gov/data/getstarted-data/collection-overview/missions/gedi-overview/).   [46]. The image in the left panel is from Google Earth ® , and the right panel is a waveform return where the upper and lower peaks come from the forest canopy and the ground surface, respectively. SLS is deployed onboard satellites and can be classified into full-waveform and photo counting. Compared with the ALS and TLS, which have insufficient spatial coverage, the SLS is more suitable for forest surveys at global scales [47]. The Geoscience Laser Altimeter System (GLAS) is a spaceborne full-waveform LiDAR system onboard the Ice, Cloud, and land Elevation Satellite-1 (ICESat-1) [48,49]. The GLAS full-waveform data provide the first spaceborne laser altimetry data for Earth observations and have been applied for LAI estimations [33]. Figure 4 shows an example of the GLAS data with a footprint size of 70 m. The follow-on to the ICESat mission, ICESat-2, launched in September 2018, carries the Advanced Topographic Laser Altimeter System (ATLAS), a LiDAR system that utilizes the photon-counting technology [35]. The ICESat-2/ATLAS data have potential for the estimation of the forest canopy cover and LAI [50]. In addition, the Global Ecosystem Dynamics Investigation (GEDI) installed on the International Space Station (ISS) was launched in December 2018 [34]. The GEDI measures the forest canopy height and canopy vertical structure. GEDI provides the raw waveforms (Level 1B) and the LAI and vertical profile data (L2B) from the Land Processes Distributed Active Archive Center (LPDAAC) (https://lpdaac.usgs.gov/data/get-started-data/collection-overview/missions/gedioverview/).

Data Preprocessing
Preprocessing of the point cloud data includes registration, noise removal, and ground filtering ( Figure 5) [51,52]. During the multiple scans, the data of all scan locations must be registered into a single-point cloud dataset with a common coordinate to clip overlapping points and resolve the point redundancy [53,54]. Two registration methods can be used: the reflector registration and the reflectorfree registration method [55]. Commercial TLS scanners provide reflectors to be placed in each scan. However, this method is limited in forestry applications because few natural tie points can be found. Therefore, artificial reflectors are used as tie points for registration [31]. The placement of artificial reflectors is labor-intensive because of obstructions. Therefore, several reflector-free registration methods have been applied to overcome difficulties associated with the placement of artificial reflectors [56][57][58][59]. After the data registration, noise removal and ground filtering were handled together. The noises in point cloud data, that is, the unreasonably high or low elevation values, are often randomly distributed. The simplest way to identify such outliers is to examine the frequency distribution of elevation values [60,61]. Subsequently, non-ground points and ground are separated by ground filtering methods [61,62]. Filtering methods can be mainly categorized into three groups: slope-based, mathematical morphology-based, and surface-based methods [63]. Slope-based methods are based on the assumption that the change of the slope of the terrain is generally gradual in a neighborhood, while the change of the slope between buildings or trees and the ground is very abrupt and large [64,65]. Slope-based methods are affected by complex terrain. Mathematical morphology-based methods use mathematical morphology to remove non-ground points [62]. The window size is critical for mathematical morphology-based methods. In contrast to slope-based and mathematical morphology-based methods, surface-based methods gradually approximate the ground surface by iteratively selecting ground points from the original dataset. The core of surfacebased methods is to create a surface that is close to the bare ground [55,66]. After filtering the data, ground points are used to produce a digital elevation model (DEM). The height of LiDAR returns is normalized with respect to the corresponding DEM. The relative height above the DEM can be used to remove understory vegetation [37,67] and the height threshold can be used to separate ground returns and canopy returns [23].

Data Preprocessing
Preprocessing of the point cloud data includes registration, noise removal, and ground filtering ( Figure 5) [51,52]. During the multiple scans, the data of all scan locations must be registered into a single-point cloud dataset with a common coordinate to clip overlapping points and resolve the point redundancy [53,54]. Two registration methods can be used: the reflector registration and the reflector-free registration method [55]. Commercial TLS scanners provide reflectors to be placed in each scan. However, this method is limited in forestry applications because few natural tie points can be found. Therefore, artificial reflectors are used as tie points for registration [31]. The placement of artificial reflectors is labor-intensive because of obstructions. Therefore, several reflector-free registration methods have been applied to overcome difficulties associated with the placement of artificial reflectors [56][57][58][59]. After the data registration, noise removal and ground filtering were handled together. The noises in point cloud data, that is, the unreasonably high or low elevation values, are often randomly distributed. The simplest way to identify such outliers is to examine the frequency distribution of elevation values [60,61]. Subsequently, non-ground points and ground are separated by ground filtering methods [61,62]. Filtering methods can be mainly categorized into three groups: slope-based, mathematical morphology-based, and surface-based methods [63]. Slope-based methods are based on the assumption that the change of the slope of the terrain is generally gradual in a neighborhood, while the change of the slope between buildings or trees and the ground is very abrupt and large [64,65]. Slope-based methods are affected by complex terrain. Mathematical morphology-based methods use mathematical morphology to remove non-ground points [62]. The window size is critical for mathematical morphology-based methods. In contrast to slope-based and mathematical morphology-based methods, surface-based methods gradually approximate the ground surface by iteratively selecting ground points from the original dataset. The core of surface-based methods is to create a surface that is close to the bare ground [55,66]. After filtering the data, ground points are used to produce a digital elevation model (DEM). The height of LiDAR returns is normalized with respect to the corresponding DEM. The relative height above the DEM can be used to remove understory vegetation [37,67] and the height threshold can be used to separate ground returns and canopy returns [23]. Preprocessing of the full waveform data includes smoothing, the identification of signal start and end points, and Gaussian decomposition ( Figure 6). Because of the rough shapes of waveforms, the estimation of initial values results in a large number of modes with narrow widths and low amplitudes. Therefore, it is necessary to smooth the waveforms to obtain a smaller number of modes [68]. To identify the signal start and end points, each received waveform is first smoothed using a Gaussian filter with a width similar to that of the transmitted laser pulse [69,70]. A threshold above the background noise level is then used to obtain the signal start and end points. The signal start and end points are the first and last bin locations at which the waveform intensity is larger than the threshold above the mean background noise in the waveform [71]. Different thresholds have been used in the literature including 3 times the standard deviation [70,72,73], 4 times the standard deviation [74], or 4.5 times the standard deviations [75][76][77]. There is no consistent optimal threshold that can be applied to all sites [69]. Finally, a Gaussian decomposition algorithm (Equation (1)) is used to decompose the filtered waveform into a series of Gaussian peaks, where the last peak of decomposed components corresponds to the ground surface [78]: where ( ) is the LiDAR waveform; a least-squared method is used to compute the model parameters; , , and are the amplitude, location, and width of a decomposed Gaussian peak; and n is the number of decomposed Gaussian peaks. The absolute value of is used in Equation (1) to avoid decomposed Gaussian peaks with negative amplitudes. Preprocessing of the full waveform data includes smoothing, the identification of signal start and end points, and Gaussian decomposition ( Figure 6). Because of the rough shapes of waveforms, the estimation of initial values results in a large number of modes with narrow widths and low amplitudes. Therefore, it is necessary to smooth the waveforms to obtain a smaller number of modes [68]. To identify the signal start and end points, each received waveform is first smoothed using a Gaussian filter with a width similar to that of the transmitted laser pulse [69,70]. A threshold above the background noise level is then used to obtain the signal start and end points. The signal start and end points are the first and last bin locations at which the waveform intensity is larger than the threshold above the mean background noise in the waveform [71]. Different thresholds have been used in the literature including 3 times the standard deviation [70,72,73], 4 times the standard deviation [74], or 4.5 times the standard deviations [75][76][77]. There is no consistent optimal threshold that can be applied to all sites [69]. Finally, a Gaussian decomposition algorithm (Equation (1)) is used to decompose the filtered waveform into a series of Gaussian peaks, where the last peak of decomposed components corresponds to the ground surface [78]: where f(x) is the LiDAR waveform; a least-squared method is used to compute the model parameters; a i , µ i , and σ i are the amplitude, location, and width of a decomposed Gaussian peak; and n is the number of decomposed Gaussian peaks. The absolute value of a i is used in Equation (1) to avoid decomposed Gaussian peaks with negative amplitudes.
Remote Sens. 2020, 12, x FOR PEER REVIEW 8 of 29 Figure 6. Preprocessing of the full waveform data.

Gap-Based Methods
The LAI is mainly estimated from LiDAR data by means of the correlation with the gap fraction [6,21,22,30,31,79,80]. The theoretical basis of gap-based methods is the Beer-Lambert law [81]: where ( ) is the canopy gap fraction at zenith angle θ. Based on the above equation, the light attenuation in vegetation canopies can be represented by the light extinction as a function of the LAI [22,82]: where I is the below-canopy light intensity, I0 is the above-canopy light intensity, k is the extinction coefficient, and I/I0 is the fraction of light transmitted through the canopy. However, the gap fraction cannot directly be measured by LiDAR but is derived from LiDAR metrics. Various LPMs are used as proxies for I/I0 to estimate LAI. For point cloud data, the LPM can be calculated as the ratio of the number of ground returns to the number of total returns or the number of sky pixels to the number of total pixels (number-based ratio) and the ratio of the sum of intensity values of ground returns to the sum of intensity values of total returns (intensity-based ratio) [19,83]. For the waveform data, the LPM can be calculated by the ratio of the ground return energy to the total return energy (ground-to-total energy ratio) [18,84,85]. Based on different LiDAR data characteristics, a variety of LPMs can be used for the estimation of the LAI (Table 2).

Gap-Based Methods
The LAI is mainly estimated from LiDAR data by means of the correlation with the gap fraction [6,21,22,30,31,79,80]. The theoretical basis of gap-based methods is the Beer-Lambert law [81]: where P(θ) is the canopy gap fraction at zenith angle θ. Based on the above equation, the light attenuation in vegetation canopies can be represented by the light extinction as a function of the LAI [22,82]: where I is the below-canopy light intensity, I 0 is the above-canopy light intensity, k is the extinction coefficient, and I/I 0 is the fraction of light transmitted through the canopy. However, the gap fraction cannot directly be measured by LiDAR but is derived from LiDAR metrics. Various LPMs are used as proxies for I/I 0 to estimate LAI. For point cloud data, the LPM can be calculated as the ratio of the number of ground returns to the number of total returns or the number of sky pixels to the number of total pixels (number-based ratio) and the ratio of the sum of intensity values of ground returns to the sum of intensity values of total returns (intensity-based ratio) [19,83]. For the waveform data, the LPM can be calculated by the ratio of the ground return energy to the total return energy (ground-to-total energy ratio) [18,84,85]. Based on different LiDAR data characteristics, a variety of LPMs can be used for the estimation of the LAI (Table 2). For point cloud data from TLS, the gap fraction can be calculated by the ratio of the number of sky pixels to the number of total pixels [79]. Two methods are commonly used for the LAI estimation from point cloud data based on the gap fraction theory. One is a two-dimensional (2D) approach and the other is a 3D approach [20]. The main steps of the 2D method are as follows ( Figure 7): (1) The 3D point cloud data are first converted to the spherical coordination system using spherical projection; (2) data from the hemisphere are then converted to a plane; that is, the 3D point cloud data are converted to 2D raster images using geometrical projection; (3) 2D raster images are divided into sky and foliage elements, and the gap fraction is calculated from the ratio of the number of sky pixels to the number of total pixels; and (4) the gap fraction is used to retrieve the LAI. Danson et al. [79] estimated the forest canopy gap fraction from TLS point cloud data based on the 2D method. The results showed that the gap fraction obtained from TLS data is similar to the gap fraction measured in the field. Zheng et al. [20] converted the 3D point cloud data to 2D raster images using two geometrical projection techniques and estimated the effective LAI (LAI eff ). They reported that the stereographic projection-based TLS LAI eff model is more robust than the Lambert azimuthal equal-area projection TLS LAI eff model.  For point cloud data from TLS, the gap fraction can be calculated by the ratio of the number of sky pixels to the number of total pixels [79]. Two methods are commonly used for the LAI estimation from point cloud data based on the gap fraction theory. One is a two-dimensional (2D) approach and the other is a 3D approach [20]. The main steps of the 2D method are as follows ( Figure 7): (1) The 3D point cloud data are first converted to the spherical coordination system using spherical projection; (2) data from the hemisphere are then converted to a plane; that is, the 3D point cloud data are converted to 2D raster images using geometrical projection; (3) 2D raster images are divided into sky and foliage elements, and the gap fraction is calculated from the ratio of the number of sky pixels to the number of total pixels; and (4) the gap fraction is used to retrieve the LAI. Danson et al. [79] estimated the forest canopy gap fraction from TLS point cloud data based on the 2D method. The results showed that the gap fraction obtained from TLS data is similar to the gap fraction measured in the field. Zheng et al. [20] converted the 3D point cloud data to 2D raster images using two geometrical projection techniques and estimated the effective LAI (LAIeff). They reported that the stereographic projection-based TLS LAIeff model is more robust than the Lambert azimuthal equalarea projection TLS LAIeff model.  LAI derived from TLS point cloud data using 2D method is in good agreement with the field LAI data. However, the 3D structural information of the scanned canopy is lost and the gap fraction must be obtained from the original 3D data. In 3D-based methods, the main steps to estimate LAI include ( Figure 8): (1) the point cloud data are voxelized and a voxel-based tree model is produced from the registered point cloud dataset; (2) gap fraction of the layer is calculated by the number of empty voxels and total number of incident laser beams that reach each horizontal layer; (3) the LAI of each horizontal layer is calculated by using the gap fraction in the layer and extinction coefficient; and (4) the LAI of tree is cumulated from the first horizontal layer to the last layer. Takeda et al. [86] divided the TLS point cloud data into horizontal and vertical elements, calculated the gap fraction for each voxel, and estimated the plant area index (PAI). The PAI estimate from TLS is in good agreement with the field data (R 2 = 0.69). Zheng and Moskal [31] proposed a voxel-based algorithm to quantitatively identify the canopy structure and directly predict the LAI eff from TLS. The results showed that the voxel-based method explains 88.7% of the LAI eff of the field measurement. These results show that TLS provides direct, nondestructive estimates of the LAI.
Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 29 LAI derived from TLS point cloud data using 2D method is in good agreement with the field LAI data. However, the 3D structural information of the scanned canopy is lost and the gap fraction must be obtained from the original 3D data. In 3D-based methods, the main steps to estimate LAI include ( Figure 8): (1) the point cloud data are voxelized and a voxel-based tree model is produced from the registered point cloud dataset; (2) gap fraction of the layer is calculated by the number of empty voxels and total number of incident laser beams that reach each horizontal layer; (3) the LAI of each horizontal layer is calculated by using the gap fraction in the layer and extinction coefficient; and (4) the LAI of tree is cumulated from the first horizontal layer to the last layer. Takeda et al. [86] divided the TLS point cloud data into horizontal and vertical elements, calculated the gap fraction for each voxel, and estimated the plant area index (PAI). The PAI estimate from TLS is in good agreement with the field data (R 2 = 0.69). Zheng and Moskal [31] proposed a voxel-based algorithm to quantitatively identify the canopy structure and directly predict the LAIeff from TLS. The results showed that the voxel-based method explains 88.7% of the LAIeff of the field measurement. These results show that TLS provides direct, nondestructive estimates of the LAI. The LPM can also be calculated as the ratio of the number of ground returns to the number of total returns [19,83,87]. Figure 9 shows the flowchart of LAI estimation based on return number or intensity: (1) the ground and canopy returns are separated according to the height threshold; (2) the gap fraction is calculated as the ratio of the number of ground returns to the number of total returns (or the ratio of the sum of intensity values of the ground returns to the sum of intensity values of the total returns); and (3) the LAI is determined using the gap fraction based on the Beer-Lambert law. The LAI obtained from ALS point cloud data using the number-based ratio is in good agreement with the field LAI [22,23,88]. Aside from the number of returns, the ALS point cloud data intensity is increasingly used in remote sensing applications. Zhao and Popescu [19] used the ratio of the sum of intensity values of the ground returns to the sum of intensity values of the total returns to estimate the LAI. In addition, the forest LAI can be reliably estimated using the LPM based on intensity alone, with R 2 = 0.610 [83]. The LPM can also be calculated as the ratio of the number of ground returns to the number of total returns [19,83,87]. Figure 9 shows the flowchart of LAI estimation based on return number or intensity: (1) the ground and canopy returns are separated according to the height threshold; (2) the gap fraction is calculated as the ratio of the number of ground returns to the number of total returns (or the ratio of the sum of intensity values of the ground returns to the sum of intensity values of the total returns); and (3) the LAI is determined using the gap fraction based on the Beer-Lambert law. The LAI obtained from ALS point cloud data using the number-based ratio is in good agreement with the field LAI [22,23,88]. Aside from the number of returns, the ALS point cloud data intensity is increasingly used in remote sensing applications. Zhao and Popescu [19] used the ratio of the sum of intensity values of the ground returns to the sum of intensity values of the total returns to estimate the LAI. In addition, the forest LAI can be reliably estimated using the LPM based on intensity alone, with R 2 = 0.610 [83]. The full-waveform LiDAR records a continuous distribution of returned energy and can be used to characterize the LAI and vertical LAI profile. For the full waveform, the LPM can be calculated as the ratio of the ground return energy to the total return energy (ground-to-total energy ratio) [18,84,85]. The main steps to estimate LAI based on ground-to-total energy ratio are as follows: (1) the ground and canopy returns are separated using height threshold/Gaussian decomposition; (2) the ground-to-total energy ratio is used to calculate the LPM; and (3) the LPM is used to estimate the LAI ( Figure 10).
The separation of the ground and canopy returns is a key step to calculate the ground-to-total energy ratio. Different separation methods have been applied. A simple method is based on the height threshold; that is, the ground return energy is calculated using the height above the ground and below the height threshold. A height threshold of 2.0 m has been used to separate the ground and canopy when calculating the ground-to-total energy ratio. Based on the method, the GLASpredicted and field-measured LAI are in good agreement [84,89]. However, the height threshold varies depending on the location and species. Therefore, the Gaussian decomposition method is incorporated to derive the LAI and vertical foliage profile (VFP) using LiDAR. Tang et al. [18] used the Gaussian decomposition method [90] to decompose full-waveform data into multiple Gaussian functions. The last peak of all decomposed Gaussian components is assumed to be the ground peak and the last Gaussian fit represents the ground return [78,91]. The gap fraction was calculated using laser energy returns from the canopy and ground and used to estimate the LAI and VFP [15,18,33]. The ability to derive the LAI and VFP facilitates large-scale LAI mapping using LiDAR, and it frees the requirement for associated field data. The full-waveform LiDAR records a continuous distribution of returned energy and can be used to characterize the LAI and vertical LAI profile. For the full waveform, the LPM can be calculated as the ratio of the ground return energy to the total return energy (ground-to-total energy ratio) [18,84,85]. The main steps to estimate LAI based on ground-to-total energy ratio are as follows: (1) the ground and canopy returns are separated using height threshold/Gaussian decomposition; (2) the ground-to-total energy ratio is used to calculate the LPM; and (3) the LPM is used to estimate the LAI (Figure 10).
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 29 Figure 10. Estimation of LAI based on the ground-to-total energy ratio.

Contact-Based Methods
The basis of contact-based methods is the contact frequency, calculated as the probability of a beam penetrating the canopy coming into contact with a vegetative element [32,92]. The contact frequency N(θ) between a light beam and vegetation element in the direction (θ) is given by Figure 10. Estimation of LAI based on the ground-to-total energy ratio.
The separation of the ground and canopy returns is a key step to calculate the ground-to-total energy ratio. Different separation methods have been applied. A simple method is based on the height threshold; that is, the ground return energy is calculated using the height above the ground and below the height threshold. A height threshold of 2.0 m has been used to separate the ground and canopy when calculating the ground-to-total energy ratio. Based on the method, the GLAS-predicted and field-measured LAI are in good agreement [84,89]. However, the height threshold varies depending on the location and species. Therefore, the Gaussian decomposition method is incorporated to derive the LAI and vertical foliage profile (VFP) using LiDAR. Tang et al. [18] used the Gaussian decomposition method [90] to decompose full-waveform data into multiple Gaussian functions. The last peak of all decomposed Gaussian components is assumed to be the ground peak and the last Gaussian fit represents the ground return [78,91]. The gap fraction was calculated using laser energy returns from the canopy and ground and used to estimate the LAI and VFP [15,18,33]. The ability to derive the LAI and VFP facilitates large-scale LAI mapping using LiDAR, and it frees the requirement for associated field data.

Contact-Based Methods
The basis of contact-based methods is the contact frequency, calculated as the probability of a beam penetrating the canopy coming into contact with a vegetative element [32,92]. The contact frequency N(θ) between a light beam and vegetation element in the direction (θ) is given by where G(θ) is the projection function that corresponds to the fraction of foliage projected on the plane normal to the solar direction. For TLS point cloud data, the main steps to estimate LAI based on contact frequency include the following ( Figure 11): (1) the point cloud data are voxelized, and a voxel-based tree model is produced from the registered point cloud dataset; (2) the contact frequency of the layer is calculated from the number of intercepted laser beams and total number of incident laser beams that reach each horizontal layer (Equation (5)); (3) the LAI of each horizontal layer is calculated by using the contact frequency of the laser beams in the layer and projection function (Equation (6)); and (4) the LAI of the entire tree is cumulated from the first horizontal layer to the last layer. N(k) = n I (k) n I (k) + n P (k) (5) where N(k) is the contact frequency of laser beams in the kth layer, n I (k) is the number of laser beams intercepted by the kth layer, n P (k) is the number of laser beams passing through the kth layer, and n I (k) + n P (k) is the total number reach the kth layer.
where LAI(k) is the LAI of the kth horizontal layer within the canopy. Hosoi and Omasa [32] developed a voxel-based canopy profiling (VCP) method based on the contact frequency for accurate LAD and LAI estimation using TLS point cloud data. First, they produced a voxel-based tree model, where the voxel is defined as a volume element in a 3D array. In each horizontal layer, the voxel was then assigned a different attribute value. For example, a voxel with attribute 1 indicates that the laser beams are intercepted, and a voxel with attribute 2 reflects that the laser beams pass through. The contact frequency of the layer is calculated by the number of intercepted laser beams and total number of incident laser beams. Finally, the contact frequency of each horizontal layer is calculated from the bottom canopy layer (lowest horizontal layer) to the top canopy layer (highest horizontal layer), and the LAIs of each horizontal layer and entire tree are obtained. The authors demonstrated that the LAD and LAI can be computed by directly counting the contact frequency based on the precise voxel model. The error of the best LAD and LAI estimations was 0.7%. The contact frequency calculated from TLS point cloud data utilizing the voxel-based approach can also be used to estimate different LAI layers in savanna trees [93]. The VCP method is one of the methods used to estimate the LAI and VFP from TLS point cloud data. The method converts point cloud data into a voxel-based 3D model that can reproduce each tree. The voxel model computes LAD and LAI by directly counting the contact frequency for each layer and the whole canopy. However, voxel-based approaches are usually associated with time-consuming data acquisition and registration. In addition, the accuracy of these approaches depends on the voxel size.

Biophysical Regression Methods
The LAI can be estimated from the regression of forest biophysical parameters derived from LiDAR, such as LiDAR height and foliage density metrics [19,94,95]. The main steps are as follows: (1) LiDAR metrics are extracted from LiDAR data; and (2) LiDAR-derived metrics are used to estimate the vegetation LAI using allometric relationships. A multivariate linear regression model [94] and partial least squares regression model [96] utilize the height and foliage density metrics to estimate the LAI of the forest. In addition, LiDAR intensity data are increasingly used in estimation of LAI, and intensity metrics are related to the LAI [25]. The combination of the height and intensity metrics has a higher predictive power [19,83]. Hosoi and Omasa [32] developed a voxel-based canopy profiling (VCP) method based on the contact frequency for accurate LAD and LAI estimation using TLS point cloud data. First, they produced a voxel-based tree model, where the voxel is defined as a volume element in a 3D array. In each horizontal layer, the voxel was then assigned a different attribute value. For example, a voxel with attribute 1 indicates that the laser beams are intercepted, and a voxel with attribute 2 reflects that the laser beams pass through. The contact frequency of the layer is calculated by the number of intercepted laser beams and total number of incident laser beams. Finally, the contact frequency of each horizontal layer is calculated from the bottom canopy layer (lowest horizontal layer) to the top canopy layer (highest horizontal layer), and the LAIs of each horizontal layer and entire tree are obtained. The authors demonstrated that the LAD and LAI can be computed by directly counting the contact frequency based on the precise voxel model. The error of the best LAD and LAI estimations was 0.7%. The contact frequency calculated from TLS point cloud data utilizing the voxel-based approach can also be used to estimate different LAI layers in savanna trees [93]. The VCP method is one of the methods used to estimate the LAI and VFP from TLS point cloud data. The method converts point cloud data into a voxel-based 3D model that can reproduce each tree. The voxel model computes LAD and LAI by directly counting the contact frequency for each layer and the whole canopy. However, voxel-based approaches are usually associated with time-consuming data acquisition and registration. In addition, the accuracy of these approaches depends on the voxel size.

Biophysical Regression Methods
The LAI can be estimated from the regression of forest biophysical parameters derived from LiDAR, such as LiDAR height and foliage density metrics [19,94,95]. The main steps are as follows: (1) LiDAR metrics are extracted from LiDAR data; and (2) LiDAR-derived metrics are used to estimate the vegetation LAI using allometric relationships. A multivariate linear regression model [94] and partial least squares regression model [96] utilize the height and foliage density metrics to estimate the LAI of the forest. In addition, LiDAR intensity data are increasingly used in estimation of LAI, and intensity metrics are related to the LAI [25]. The combination of the height and intensity metrics has a higher predictive power [19,83].

Model Inversion Methods
Physical radiative transfer (RT) models have been implemented to simulate LiDAR data under specific forest stand representations and LiDAR specifications [97][98][99][100][101]. The LAI can be retrieved from LiDAR data using the model inversion method. Sun and Ranson [100] developed a 3D model that successfully simulates full-waveform data by building a 3D forest stand scene, which is divided into cells with specific characteristics. Koetz et al. [102] adapted the model of Sun and Ranson [100] and inverted a 3D LiDAR waveform model to estimate the fractional cover and overstory LAI of a coniferous forest. The LAI estimate agreed well with the field measurements (RMSE = 0.41). Ma et al. [103] used the canopy height derived from LiDAR and canopy structure information derived from the geometric-optical mutual shadowing (GOMS) model to estimate the large-scale forest LAI. Based on the comparison of their results with the field LAI, the highest R 2 values were 0.73. However, their study failed to retrieve the vertical distribution of canopy. Based on the assumption that the gap probability is the reverse of the vertical canopy profile, the vertical distribution of the gap probability can be derived [104], and the LAI and vertical foliage area volume density (FAVD) profile can be directly retrieved from ALS full-waveform data using an RT model [105]. Table 3 summarizes the performance of different methods to estimate LAI from the LiDAR data. Among the gap-based category, the 3D method shows the best performance with R 2 = 0.89 and RMSE = 0.007 because the method could improve the accuracy of gap fraction for each layer and provide information about the light penetration condition within the canopy. In contrast, the accuracy based on return intensity is relatively low, because the LiDAR intensity is affected by many factors, such as laser power, incidence angle, object reflectivity, and range of the LiDAR sensor to the object [106]. The other three categories give moderate estimation accuracy with R 2 > 0.69. The regression methods are relatively simple to apply; however, these methods are not universally applicable and need ground LAI measurements. The voxel-based model not only computes LAD and LAI by directly counting the contact frequency for each layer, but also provides the contact frequency information for the whole canopy. However, the voxel-based approaches are usually associated with time-consuming data acquisition and registration processes. In addition, the accuracy of the approaches depends on the voxel size. The LiDAR RT model could simulate LiDAR data under specific forest stand representations and sensor specifications. However, the model is usually complicated and requires many data inputs.

Methods Advantages and Disadvantages R 2 RMSE References
Gap-based 2D method Based on commonly accepted theories adopted in DHP, easily applicable in practice. Lacks 3D structural information of the scanned canopy. 0.71 1.03 [20] 3D method Improves the accuracy of gap fraction for each layer and provides the light penetration information within the canopy. The voxel resolution directly determines the level of details for the canopy structure.
0.89 0.007 [31] Number-based ratio The penetration metrics are related to LAI, whereas the selection of height threshold and plot size greatly affects the effectiveness of the metrics. 0.70 N/A [23] Intensity-based ratio The intensity metrics are related to LAI. The combination of intensity and other metrics could provide a higher predictive power. However, the LiDAR intensity value needs to be corrected.
0.61 0.66 [83] Ground to total energy ratio An effective method to derive LAI and VFP from large footprint waveforms. Estimation of the canopy vertical structural information is affected by topography. 0.69 0.33 [91] Contact-based Voxel-based method Compute LAD and LAI by directly counting the contact frequency for each layer, and provide the contact frequency for the whole canopy. The methods are usually associated with time-consuming data acquisition and registration processes, and the accuracy depends on the voxel size.

Regression of LiDAR metrics
Approximate LAI from LiDAR metrics, relatively simple to apply. Not universally applicable and need ground LAI measurements. 0.69 0.13 [94] Model inversion LiDAR RT model Simulate LiDAR data under specific forest stand representations and sensor specifications. Complicated and require many data inputs. 0.73 0.67 [105]

LAI Validation
Different schemes that have been used to validate the LiDAR LAI include direct comparison methods, scaling-up strategies, and the intercomparison of multiple products.
Field measurements, typically limited to a point or very small area, are vital because they are the basis for all validation studies. Based on the direct comparison method, field measurements and the LAI from different LiDAR systems are directly compared. The field LAI obtained from destructive sampling was used to validate the TLS LAI and LVIS LAI; the LAI derived from LiDAR and destructive sampling were in excellent agreement [18,32]. In addition, the LAI from digital hemispherical photography (DHP) and LAI-2200 are commonly used to validate the LAI from different LiDAR platforms [19,23,30,107,108]. Because of the spatial scale mismatch between field measurements and remote sensing estimation, it is usually difficult to use this method for global validation.
Based on the scaling-up strategy, the field LAI is scaled up via different platforms for the validation with SLS LAI products, thus bridging the scale differences between the field LAI and the LAI derived from SLS. The TLS provides an additional indirect ground-based technique to estimate the LAI. The LAI derived from TLS can be used as field measurement [109][110][111]. The LAI can be validated using scaling-up strategy at multiple spatial scales through LiDAR remote sensing [91]. First, the ground-based (DHP, LAI-2200, TLS) LAI is related to aircraft observations of the LAI. Then, the ALS observations of the LAI are used to validate the LAI derived from SLS tracks that intersect the aircraft coverage. The upscaling validation method has been widely used in the remote sensing community [112]. However, this method may be affected by several factors. First, ground LAI derived from photos, TLS, and LAI sensors may be inconsistent among themselves. Second, errors are introduced by the scale mismatch between ground field data and ALS. Third, different data sources are based on varying spatial footprints and viewing geometries, which may complicate LAI validation.
Multiple products can be compared to determine the relative quality of land products. The intercomparison method has been used as a proxy to assess the temporal and spatial consistency. The LiDAR-derived LAI values are aggregated to the resolution of the passive satellite LAI products to evaluate all LAI data. The GLOBCARBON [19] and MODIS [113] LAI products have been used to compare with the LiDAR-derived LAI map. The registration between LiDAR and the satellite LAI maps is important because misregistration could severely bias the pixel-by-pixel comparison.
Current validation studies are mostly performed at local scales. The results indicate a significant correlation between airborne LiDAR and the field-derived LAI at the plot scale in a tropical forest, with R 2 = 0.58 and RMSE = 1.36 [96], and a moderate agreement (R 2 = 0.63, RMSE = 1.36) between LVIS and the field-derived LAI at tower footprint scales in tropical rainforest [18]. Based on a large-scale validation method, R 2 and RMSE values of 0.69 and 0.33 were obtained between the LVIS LAI and GLAS LAI at GLAS tracks in the Sierra National Forest [91]. The LiDAR-derived LAI was evaluated using the MODIS LAI product, yielding R 2 and RMSE values of 0.86 and 0.76 in mixed coniferous forest [113]. However, the LAIs derived from ALS or SLS still lack sufficient ground validation and intercomparison validation using existing global LAI products generated from passive optical sensors. The LiDAR also has the capability to provide the LAI vertical profile, from site [18] to regional and continental [15,114] scales. Due to the lack of ground-measured data on the LAI vertical profile of the forest, the LAI vertical distribution map has not been completely validated. Existing validation work is mainly based on limited site or observation tower data [114].

Impact Factors
Several factors should be considered in the estimation of LAI with LiDAR technology ( Table 4). The gap fraction theory is based on the assumption of a random distribution of foliage elements [6]. However, in reality, leaves are generally clumped rather than randomly distributed, and thus the LAI estimation must be corrected by considering the error caused by foliage clumping, which leads to underestimations of the LAI [37]. Therefore, several traditional clumping algorithms based on the gap size analysis theory [115] and gap size distribution method [37] were implemented in TLS.
The vertical distribution profile of the forest CI can also be calculated from TLS and used to correct the clumping effect of the LAI at different heights [116]. These studies proved that the CI can be successfully estimated using TLS [117]. In contrast, the estimation of the CI using ALS and SLS is rare. Yang et al. [78] presented a physical method with the gap fraction model to estimate the LAI by correcting the between-crown clumping in discontinuous forest using GLAS data. Based on the correction, R 2 and RMSE values of 0.83 and 0.39, respectively, were obtained. However, it is difficult to use SLS to quantify the clumping because the CI is highly variable and changes even in the same forest. The footprint size of ALS and SLS is too large for the detection of small gaps. Therefore, the quantification of the clumping effect remains an ongoing task in ALS and SLS.
The height threshold and sampling size of LiDAR data are key factors affecting the accuracy of the LAI estimation [83,118]. However, there is no optimal sampling size and height threshold when estimating the vegetation LAI. In some cases, the height threshold of LiDAR was set to a constant value, such as 2 m, to separate the ground and canopy returns in ALS point cloud data [22] and separate the ground and canopy when calculating a ground-to-total energy ratio in GLAS [84,89]. A threshold of 3.6 m was used to calculate the LPM value in a temperate forest [19]. In other cases, the height threshold of LiDAR was related to the setup height of the ground measurement. The setup height of the fisheye camera (i.e., 1.25 m) was used to separate ground and vegetation returns [119]. For the LiDAR plot radius, a common choice is to use the same LiDAR sampling size as that used to collect field plot data [94,120]. On the other hand, the sampling size of LiDAR is related to the canopy height. A radius size of the LiDAR sampling scale equivalent to the entire forest canopy height [24] and 0.75 times the tree height [17] was used to calculate the LPM value. However, the choice of a variable radius is difficult: optimum values range from 75% to 100% of the canopy height [19,119]. In summary, the accuracy of the LAI estimation using LiDAR strongly depends on the height threshold and sampling size [121]. However, the height threshold and sampling size are site-specific, and there are no clear guidelines for the determination of an appropriate value. Table 4. Factors affecting the LAI estimation from LiDAR data.

Factor Description Mitigation Method Advantages and Disadvantages References
Clumping effect Leaves are not randomly distributed but clumped, which may cause the underestimation of the LAI.
Estimate CI and make clumping correction.
The CI can be successfully estimated using TLS, whereas it is rare to estimate CI from ALS and SLS. [37,78,115,116] Footprint size The relative contribution of the vegetation and terrain signal of waveforms is different under different footprint sizes.
Study the influence of footprint size by LiDAR model and obtain optimal footprint size.
A large footprint size contains sufficient gaps for the detection of the underlying ground. However, the ALS and SLS footprint sizes are too large for the small gap detection. [34,122] Height threshold and sampling size The height threshold is critical for the separation of ground and vegetation returns. The sampling size affects the LiDAR and field LAI comparison.
Set the height threshold similar to the ground measurement, and the sampling size the size of field plot.
The accuracy of the LAI estimation is highest with the optimal height threshold and sampling size. However, the optimal value is site-specific. [17,19,24,119,121] Occlusion Vegetation elements intercept laser beams and stop them from being in contact with further material along the path.
Acquire data from multiple TLS scans, or combine TLS and ALS data.
Easy to eliminate blind regions and overcome the occlusion effects. However, it will increase the measurement work and the data size. [32,92] Topography The slope can blur the boundary between vegetation and ground return and affects the accuracy of the canopy vertical structure estimation.
Filter out larger slopes, or compensate the terrain effect using slope-adaptive waveform metrics.
Can compensate the terrain stretching on the forest waveform. However, the performance decreases for complex terrain. [71,77,91,110,123] Types of return LiDAR returns are from different canopy layers; using all returns is more effective than using only the first and last returns in deriving LPM.
Calculate LPMs using all returns.
Increases the effective pulse density and the sensitivity to smaller gap sizes. However, the method is not applicable for LiDAR intensity data. [22,87,88,95,124,125] Voxel size Different voxel sizes significantly affect the gap fraction and LAI estimation.
Determine voxel size based on the minimum element size of the object, or based on the TLS characteristics.
Can obtain higher LAI accuracy with the optimal voxel size. However, it is difficult to determine the optimal voxel size, which depends on many factors. [92,126,127] Woody material Woody materials (i.e., stems and branches) may lead to the LAI overestimation.
Joint use of leaf-on and leaf-off LiDAR data; or make use of the geometric and radiometric features in TLS.
Foliar and woody materials can be effectively separated using TLS. However, the classification method used in TLS is not applicable to ALS and SLS.
The relative contribution of the vegetation and terrain signal of waveforms is different under different footprint sizes [71]. Pang et al. [122] studied the effects of footprint size on the precision of canopy height estimates by means of simulation. They found that footprints with a diameter between 25 m and 30 m would be ideal to level the effects of vegetation height and terrain slope on waveform length. Milenković et al. [131] studied the influence of footprint size on the precision of forest biomass estimates from spaceborne waveform LiDAR. They recommend an optimal footprint size that is similar to the size of the field plots, i.e., 20 m diameter in their case. Dubayah et al. [34] demonstrated that 25 m is the optimal footprint size for GEDI. The suggestion is that the size is large enough to capture the entire canopy of larger diameter trees and small enough to limit the vertical mixing of vegetation and ground signals caused by surface slope.
Occlusion plays an important role in the spatial distribution of the density of the point cloud [92,132]. Occlusion effects are caused by material intercepting laser beams and stopping them from being in contact with material along their path, which results in a relatively lower inner-canopy point density and underestimation of the LAI [43]. Occlusion effects typically occur when single-scan LiDAR is applied to complex forest canopies. The addition of scans is an effective strategy to alleviate the occlusion [133]. Multiple scans can be used to obtain detailed information on the 3D distribution of the canopy, leading to an increase in the amount of data for forests and contributing to the solution of problems associated with a short zenith range [126,127,134]. However, multiple scans significantly increase the field work and size of the datasets, and biophysical parameter estimations cannot be directly compared with other instruments. The combination of TLS and ALS data is another strategy that may solve occlusion effects for very dense crowns because the profiles obtained by ALS and TLS complement each other, eliminating blind regions and yielding more accurate LAD profiles than by using each type of LiDAR alone [135].
Topographic effects can lead to significant errors in the vertical distribution of the plant area, with an RMSE up to 66.2% [134]. The slope affects the accuracy of the vertical LAI distribution because it can blur the boundary between vegetation and ground signals in a LiDAR waveform. The ground peak gradually decreases and finally disappears as the terrain slope increases, making their separation difficult and potentially leading to errors in the LAI and VFP estimates. To identify the accurate ground return and minimize the slope-induced error, Yang et al. [123] extended the geometric optical and radiative transfer (GORT) vegetation LiDAR model to consider the effect of the surface topography. The slope effect on waveforms was then assessed using model simulations. Filtering out a larger slope is an effective method to reduce the effect of the terrain [136,137]; for example, GLAS footprints were filtered using a cutoff slope threshold of 20 [91]. Although significant progress has been made regarding the use of waveforms on slopes below 20 • , the effect of the terrain slope on the estimation of forest parameters has not been thoroughly addressed, especially for steep slopes (i.e., >20 • ). Wang et al. [71] proposed slope-adaptive waveform metrics for the estimation of the forest aboveground biomass (AGB). They used a model to calculate waveforms of bare ground with known terrain slopes to compensate the terrain effect. However, the model performance decreases for complex terrains including bumps or pits. The terrain conditions are always simpler for small footprints than for larger footprints. The waveform LiDAR with a small footprint may limit the vertical mixing of vegetation and ground signals caused by the slope [122].
LiDAR returns can be reflected back from different layers of canopies, e.g., crown surfaces, inside or below top crowns, and the ground [19]. For various LAI estimation methods, it is important to note that the LPM is sensitive to the gap fraction regardless of the gap size. The penetration rate based on the first return might be insensitive to small gaps because the ALS footprints are too large to penetrate such gaps [87]. Heiskanen et al. [124] reported that the LPM utilizing the first and last returns to derive the ALS penetration rate is more sensitive to the gap fraction variations because the returns from different part of canopy are considered. Therefore, the LPMs calculated from both the first and last returns may be the best predictors of LAI [95]. Richardson et al. [22] further proved that the ALS penetration rate calculated from the first and last returns strongly correlates with the field LAI eff estimates (R 2 > 0.9). Hence, the LPM utilizing the first and last returns is recommended for LAI estimation.
The voxelization procedure involves the specification of the voxel size, which has a marked impact on the LAI estimation, because it significantly affects the gap fraction estimation [126]. Different voxel sizes have been used to produce voxel-based tree models. Van der Zande et al. [92] defined voxel size based on the minimum size of the object element, which ensures a number of point cloud data in each voxel. However, several small gaps between leaves can easily be overlooked. More recently, voxel size has been defined based on TLS characteristics [127]. For example, the voxel size is defined depending on the range and scan resolution of the TLS [32]. Grau et al. [138] assessed the effect of the voxel size on LAI estimation using a simulation framework based on the discrete anisotropic radiative transfer (DART) model. They found that voxel size is site-specific, and to obtain good accuracy of estimation, voxel size should be varied in different forest scenes.
Because woody components are sources of error in indirect LAI estimations, the separation of foliar and woody materials is crucial for the accurate estimation of the LAI [139]. The quantitative evaluation of the contribution of non-photosynthetic canopy components to the LAI will be helpful to improve the LAI estimation accuracy. A possible approach is the joint use of leaf-on and leaf-off LiDAR data. The wood area index (WAI) is generally estimated during leaf-off periods and subtracted from the total plant area index. First, the effective woody area index (WAI eff ) is estimated under leaf-off conditions, and the effective PAI (PAI eff ) is estimated under leaf-on conditions. Subsequently, the LAI eff is obtained by subtracting WAI eff from PAI eff [110]. However, this method is not suitable for evergreen vegetation such as coniferous forests. Another approach is based on the use of leaf-on LiDAR data and makes use of the geometric and radiometric features. Geometric features (linear features, random features, and surface features) have been utilized to quantitatively evaluate the contribution of woody material to the LAI using TLS point cloud data [128,140], and the shape, normal vector distribution, and structure tensor of TLS data features have been used to separate various tree organs [129]. Zhu et al. [130] proposed an adaptive radius near-neighbor search method to accurately separate foliar and woody materials in a mixed forest using TLS point cloud data. They reported that the use of a combination of radiometric and geometric features outperforms either one of them alone, yielding an average overall accuracy of 84.4% for mixed forests. However, due to the lower point density and bigger footprint of ALS and SLS relative to TLS, the classification method used in TLS is not applicable to ALS and SLS. The quantitative evaluation of the contribution of non-photosynthetic canopy components to the LAI of a forest stand must be focused on in the future.

Future Prospects
The major advantage of LiDAR technology is its capability to characterize the vertical vegetation structure at different heights [15]. LiDAR-derived LAIs have been used in the validation of the passive satellite LAI products [19,112]. We expect the use of LiDAR LAI will increase with the growing availability of high-quality LAI data derived from LiDAR. Different LiDAR data provided by the different lidar systems have been used to estimate LAI. However, there is no universal LiDAR metric for LAI estimation; therefore, the selection of proper LiDAR metrics is crucial for LAI estimation. More field measurements and novel LiDAR metrics are necessary for improved LAI estimation in the future.
The ALS observations act as a validation link between field and satellite data [91]. However, the relatively high cost of ALS flight mission has significantly limited its applications. As an alternative platform for ALS, the unmanned aerial vehicle (UAV) costs less but can provide denser points. Therefore, UAV provides an effective platform for LAI estimation [141] and acts as a validation link between field and satellite data [21]. The TLS provides an additional indirect ground-based technique to estimate the LAI [109][110][111]. However, TLS data acquisition is highly time-consuming and labor-intensive. A new backpack LiDAR system was developed for efficient and accurate forest inventory applications, and the derived LAD fits well with the TLS estimates (R 2 > 0.92, RMSE = 0.01 m 2 /m 3 ) [142]. A backpack LiDAR system may provide an alternative platform for TLS data acquisition.
The increasing availability of LiDAR data will greatly enhance the LAI estimation. Fusion of multiple LiDAR data from different systems, platforms, and temporal observations is also a continued research direction [14].

Conclusions
In this study, the LiDAR technology, LAI retrieval and validation methods, and factors affecting LAI estimation were reviewed. The use of LiDAR has become an operational data collection option for the retrieval of the LAI. All TLS, ALS, and SLS systems can provide the LAI and VFP.
LAI is mainly estimated from LiDAR data by means of the correlation with the gap fraction and contact frequency, and LAI is also estimated from the regression of forest biophysical parameters derived from LiDAR, such as LiDAR height and foliage density metrics. The TLS provides detailed information on the within-canopy structure at individual tree or stand levels and accurate LAI and VFP distribution within plots. However, TLS data processing is complex and the spatial coverage of TLS is limited. The ALS covers relatively large areas in a spatially contiguous manner and provides multiscale LAI estimation. However, the description of the within-canopy structure is limited because the penetration of the ALS point cloud data is poor in dense vegetation. The SLS provides information regarding the canopy structure with near-global coverage and thus has the potential to produce the global LAI and VFP. However, the accuracy of LAI estimation based on a large-footprint waveform is affected by the terrain. In addition to the limitation of the LiDAR instrument itself, several factors should also be considered in the estimation of LAI with LiDAR technology. Clumping and wood are common factors for all LiDAR systems. The LAI estimation is also affected by occlusion, and voxel size for TLS; sampling size, height threshold and sampling size, and types of return for ALS; and footprint size and topography for SLS. Quantification of these factors remains an ongoing task for LiDAR LAI estimation.
Direct comparison methods, scaling-up strategies, and the intercomparison of multiple products are used to validate the LiDAR LAI. The results show that the LAI derived from LiDAR and reference data were in good agreement. However, current LiDAR LAI validation studies are mostly performed at local scales, such as limited site or observation tower.
Remote sensing techniques have provided powerful and effective tools for estimating the spatial distribution of LAI for large areas. The LAI and VFP have been produced at a large scale using ICESat-1/GLAS, whereas the operation of ICESat-1/GLAS was stopped in 2009. The ICESat-2/ATLAS and GEDI are expected to provide vegetation structure information and large-scale LAI estimates. The usage of LiDAR is expected to increase based on the capability to provide the LAI vertical profile. Future research should explore LiDAR remote sensing inversion with respect to the LAI and VFP, quantitative analysis, and the large-scale validation of the LAI and VFP.