A Density-Based Approach for Leaf Area Index Assessment in a Complex Forest Environment Using a Terrestrial Laser Scanner

Forests are an important part natural ecosystems, by for example providing food, fber, habitat, and biodiversity, all of which contribute to stable natural systems. Assessing and modeling the structure and characteristics of forests, e.g., Leaf Area Index (LAI), volume, biomass, etc., can lead to a better understanding and management of these resources. In recent years, Terrestrial Laser Scanning (TLS) has been recognized as a tool that addresses many of the limitations of manual and traditional forest data collection methods. In this study, we propose a density-based approach for estimating the LAI in a structurally-complex forest environment, which contains variable and diverse structural attributes, e.g., non-circular stem forms, dense canopy and below-canopy vegetation cover, and a diverse species composition. In addition, 242 TLS scans were collected using a portable low-cost scanner, the Compact Biomass Lidar (CBL), in the Hawaii Volcanoes National Park (HAVO), Hawaii Island, USA. LAI also was measured for 242 plots in the site, using an AccuPAR LP-80 ceptometer. The frst step after cleaning the point cloud involved detecting the higher forest canopy in the light detection and ranging (lidar) point clouds, using normal change rate assessment. We then estimated Leaf Area Density (LAD), using a voxel-based approach, and divided the canopy point cloud into fve layers in the Z (vertical) direction. These fve layers subsequently were divided into voxels in the X direction, where the size of these voxels were obtained based on inter-quartile analysis and the number of points in each voxel. We hypothesized that the intensity returned to the lidar system from woody materials, like branches, would be higher than from leaves, due to the liquid water absorption feature of the leaves and higher refectance for woody material at the 905 nm laser wavelength. We also differentiated between foliar and woody materials using edge detection in the images from projected point clouds and evaluated the density of these regions to support our hypothesis. Density of points, or the number of points divided by the volume of a grid, in a 3D grid size of 0.1 m, was calculated for each of the voxels. The grid size was determined by investigating the size of the branches in the lower portion of the canopy. Subsequently, we ftted a Kernel Density Estimator (KDE) to these values, with the threshold set based on half of the area under the curve in each of the density distributions. All the grids with a density below the threshold were labeled as leaves, while those grids above the threshold were identifed as non-leaves. Finally, we modeled LAI using the point densities derived from the TLS point clouds and the listed analysis steps. This model resulted in an R2 value of 0.88. We also estimated the LAI directly from lidar data using the point densities and calculating LAD, which is defned as the total one-sided leaf area per unit volume. LAI can be obtained as the sum of the LAD values in all the voxels. The accuracy of LAI estimation was 90%, with an RMSE value of 0.31, and an average overestimation of 9% in TLS-derived LAI, when compared to feld-measured LAI. Algorithm performance mainly was affected by the vegetation density and complexity of the canopy structures. It is worth noting that, since the LAI values cannot be considered spatially independent throughout all the plots in this site, we performed semivariogram analysis on the feld-measured LAI data. This analysis showed that the LAI values Remote Sens. 2019, 11, 1791; doi:10.3390/rs11151791 www.mdpi.com/journal/remotesensing Ni PAIi = −ln( ), Ni + Ii Remote Sens. 2019, 11, 1791 2 of 21 can be assumed to be independent in plots that are at least 30 m apart. As a result, we divided the data into six subsets in which the plots were 30 m spaced. The R2 values for these subsets, based on modeling of the feld-measured LAI using leaf point density values, ranged between 0.84–0.96. The results bode well for using this method for effcient, automatic, and accurate/precise estimation of LAI values in complex forest environments, using a low-cost, rapid-scan TLS.


Introduction
The spatial distribution of foliage in a forest canopy controls energy transfer to the lower plant components and also the ground [1], the interception of precipitation [2], photosynthesis [3], and other phenomena. As a result, leaf properties and canopy structure are important inputs in eco-physical models of forest environments [4]. Leaf Area Index (LAI) is an indicator for predicting photosynthetic primary production, evapotranspiration and crop growth [5]. LAI typically is defned as the vertically integrated one-sided area of leaves per unit ground surface area on a horizontal plane [6], and traditionally is measured through destructive sampling. Since LAI cannot be easily measured through non-destructive methods, effective LAI (LAIe) is more commonly measured in the feld and calibrated to true LAI [6]. The difference between LAI and LAIe is that effective LAI does not differentiate between branches and foliar components. LAI effectively has been linked to vertical laser pulse return profles using light detection and ranging (lidar). One elegant method is to calculate the ratio of the number of returns below the canopy to the total number of returns, and assume that this is related to gap fraction (P) of the canopy [7]: where N total is the total number of returns and N bc is the number of returns below some height above the ground surface (usually between 1 and 3 m). It has also been shown that this formula does not always provide a direct estimate of the gap fraction, but the ratio shown in Formula (2) can be used for LAI estimation when using airborne lidar data [7]: where k is the extinction coeffcient. An assumption of random or spherical leaf distribution is typically used, which results in the k value of 0.5 [8]. In reality, k values range from 0.25-0.75 [9]. An alternative voxel-based approach for gap fraction estimation was presented in [10]. The authors sampled a plot with an area of 20 m by 40 m with ffteen Terrestrial Laser Scanning (TLS) scans. These scans were then co-registered, providing a dense point cloud. The point cloud was partitioned into different voxels and the number of points were counted in each voxel. The authors estimated Plant Area Index (PAI) for two different point clouds, namely leaf-on and leaf-off. PAI was calculated using the following equation: (3) where N i is the number of laser beams passing through voxel number i, and I i is the number of returns within the voxel i. The plot level PAI was then obtained by taking the average of all the voxel PAI values. LAI was estimated as the difference between the PAI measurements for leaf-on and leaf-off point clouds. The resulting LAIs were in the range of 1.4-1.8, which is signifcantly lower than the LAI values obtained via traditional manual methods (LAI = 2.7-8.2).
Another approach to assessing LAI using TLS is by estimating Leaf Area Density (LAD) [11]. LAD can be described as the total one-sided leaf area per unit volume [12] and the integration of a LAD profle vertically yields the LAI [13]. LAD can be estimated using direct, semi-direct, or indirect methods on-site [14]. The direct approach includes manual counting and measuring of leaves, while semi-direct methods count the contact of a leaf with probes inserted into the canopy [15]. Indirect techniques involve the application of passive optical devices, based on a gap fraction method [15]. However, the mentioned methods are limited in accuracy and in the spatial explicitness of their estimates [14]. As a result, the extracted 3D structures from lidar data can be used for measuring LAD [11]. One such manual method for fnding the contact frequencies in the canopy is presented in [15], which is known as the inclined point quadrat method. In this method, a probe is pierced into the canopy at known heights and angles, and then the number of contact points with the foliage components are tallied. This method estimates the probability that a laser pulse penetrating the canopy will contact a foliage component. In Equation (4), N denotes the contact frequency, θ v and φ v represent elevation and azimuth angle respectively, and H is the height. G is the projection function denoting the unit mean projected foliage area, and l(h) is the height-dependent LAD: For simplifying this equation, typically one assumes that the LAD and projection function are independent of height. Thus, we can have: where LAD is the height-independent LAD. A challenge of this method is that it is diffcult to measure this quantity [12]. One more convenient approach can be describing the gap fraction based on the same terms, with the assumption of a random spatial distribution for small leaves: In later efforts, [16] included the clumping effect and with the assumption of azimuthal symmetry, provided Equation (7) for gap fraction: where Ω describes the clumping index and ΩLAD provides the effective LAI (LAI e ). The mentioned projection function depends on the leaf normal orientation distribution. However, by restricting the penetration zenith angles to 57.5 • , the computation of the leaf normal orientation distribution can be neglected [15]). These authors have shown that G(θ v ) at the angle of 57.5 • equals 0.5. Effective LAI then can be described with this model: where LAI e is effective LAI and N(57.5 • ) denotes the number of contact points with vegetation at a zenith angle of 57.5 • . TLS, on the other hand, provides a fne spatial or angular resolution, thereby allowing the internal regions of the canopy to be evaluated from a ground perspective, making LAD estimation more accurate [17]. One of the important considerations of LAD estimation is the ability to describe the spatial distribution of leaves, separate from the woody canopy structures [14]. LAD estimation is highly affected by whether the leafy and woody parts are separated in the point cloud, and many studies have used manual techniques for leaf classifcation (e.g., [17]), although these approaches are time consuming. However, the intensity of the refected pulse often can be used to distinguish a leaf from woody structures in the canopy [14], even though the intensity is affected by the range and incidence angle. Geometric methods have also been used for separating leaves and branches in TLS data [18], but obtaining geometric attributes of leaves and canopy-level structures is challenging, and is impacted by the angular scan resolution of the system, the pulse frequency, etc. [19]. We therefore opted for a density-based approach for segmenting the leaves in TLS point clouds, since such a method is more robust against variable system characteristics and can be applied more routinely; this approach will be described in detail in the Methods section.
As mentioned earlier, LAI is calculated by the sum of the LAD measurements in vertical layers of the forest canopy: In a more recent study, Shihua et al. [11] used a voxel-based method for estimating the LAD of an individual tree using point cloud segmentation of TLS data. They differentiated between the leafy and woody materials using the normal differences of the two. They also partitioned point cloud into sub-regions to improve the accuracy of the leaf extraction. It is important to note in this context that such point cloud segmentation approaches can be classifed into three main categories, namely edge-detection, region-growth (blob-growth), and hybrid methods.
Edge-detection techniques fnd the breaks in the surface of the point cloud, whereas the region-growth methods detect the continuous surfaces that are similar in geometric properties, and the hybrid methods combine the two mentioned approaches [20]. Normals of the point cloud can be used in segmentation of unorganized 3D point clouds. In short, the more the surface structure changes in adjacent areas, the more the normals of these areas are oriented differently. Figure 1 shows an illustration of the difference in normal distributions in leaves and branches. The normal differences of the leaf point cloud is generally smaller than those from non-leaf materials [11], and for a point, p, in the point cloud, it is measured by the following equation: where n(p) is the normal of the point cloud in point p, and r is the average spatial distance between the two leaves [11]. The leaf point cloud is extracted using the magnitude of Δn and compared to a threshold. One common way for fnding the threshold is using the Otsu algorithm [21]. Furthermore, for this voxel-based method of LAD assessment, the point cloud is segmented into layers in the X direction, and LAD is calculated in all those layers using a voxel-based canopy profling (VCP) method [17]. LAD between heights h and h + Δh is calculated using the following equation: (11) where θ is the zenith angle of the lidar beam, Δh is the layer thickness, m h and m h+Δh denote the voxel coordinates on the vertical axis, corresponding to heights h and h + Δh, and n 1 (k) and n p (k) indicate the number of voxels containing leaf points and excluding, respectively. Thereby, n 1 (k) + n p (k) gives the total number of incident laser beams in the k-th layer. α(θ) is defned as: α(θ) is a correction factor of the leaf inclination angle at the laser zenith angle of θ. G(θ) is the mean projection of a unit leaf area on a plane perpendicular to the direction of the laser beam. A symmetrical shape for the leaves is assumed for measuring of G(θ) . It is defned as: where θ L indicates the leaf inclination angle, φ L is the azimuth angle of the normal, and n B and n L are obtained using the following equations: n L = (sinθ L cosθ L , sinθ L sinφ L , cosθ L ). (15) n B and n L are unit vectors showing the direction of the lidar beam and the direction of the leaf normal, respectively. The TLS used in this study provides lower-density data, due to lower angular resolution and pulse frequency, compared to commercial higher-cost systems. The minimum angular step-width of CBL is 4.36 mrad, which result in lidar point clouds with lower associated point density when compared to higher cost systems, for which the minimum angular step-width can be as small as 0.02 mrad [22]. These low-density data introduce challenges to automatic leaf segmentation algorithms. We therefore evaluated the ability of this arguably more operational system (low-cost; rapid-scan) to automatically extract leaf points, and propose an approach for estimating leaf density and LAI, using LAD assessment, at the plot level. This approach can be used for plot-level LAI assessment in complex forest environments, using such a low-cost, portable TLS, which can reduce the time and cost of data collection, and evaluate canopy-level structural attributes that cannot be measured in feld due to the complex structure of these plots. This algorithm is especially helpful when there is no access to specifc geometric attributes and parameters of the canopy-level structure, e.g., distance between the leaves, due to low-density data, or lack of relevant feld measurements.

HAVO Site
The TLS and LAI feld data for this study were collected in the Hawaii Volcanoes National Park (HAVO). This site has a latitude of 19 • 25 0 27.59 00 N and a longitude of −155 • 15 0 15.30 00 W. Unique species have evolved in this site, including Polystichum munitum, Myoporum sandwicense, Acacia koa, and etc. [23]. The 50 m × 100 m fetch area at the local fux tower site was divided into 22 rows and 11 columns with 5 m spacing forming 242 plots [24]. Figure 2 shows images from the HAVO site. It can be seen that this is a complex forest environment due to the complicated vegetation structures and variations.

Leaf Area Index (LAI) Measurement
In all of the 242 plots, LAI was measured using an AccuPAR LP-80 (METER Group, Pullman, WA, USA) instrument. Four measurements were recorded for each plot in the four cardinal directions, since the LAI is highly dependent on the spatial and illumination characteristics of the scanned location. Consequently, the average of the four measurements was considered as the LAI value for each plot. The AccuPAR LP-80 is a linear Photosynthetically Active Radiation (PAR) ceptometer. Its function is based on the amount of intercepted light in the canopy, which in turn can be used to estimate the LAI. The Accupar LP-80 has a probe consisting of 80 sensors, spaced 1 cm apart along that probe. The PAR is measured in the wavelength range of 400 to 700 nm, and has the units of micromols per meter squared per second (µmolm −2 s −1 ). This instrument can be used as a hand-held or even unattended device. The device was held at approximately breast height (DBH level; 1.3 m above ground), i.e., slightly lower than the scanner, meaning that the LAI recorded by the Accupar is a result of vegetation cover above the DBH-level only, and should correspond with the lidar data collected via the CBL, at approximately the same height above ground. A second Accupar LP-80 was used to log available PAR at one minute intervals during the same period as the feld collection effort; this instrument was mounted on a fre lookout tower approximately 500 m from the feld site and was located above the forest canopy. The recorded values from two instruments were then synchronized based on recorded time. The following formulas are used for measuring LAI using this instrument (METER Group, 2018): (16) where: τ denotes the fraction of transmitted PAR (ratio of below to above the canopy measured PAR), f b is the fraction of incident beam, K represents the extinction coeffcient, and a is the leaf absorption in the PAR band (the default value is 0.9). Table 1 shows the specifcations of the Accupar LP-80.

The Compact Biomass Lidar (CBL)
The lidar data used in this study were collected using a low-cost, portable terrestrial laser system called the Compact Biomass Lidar (CBL) (LMS151, SICK AG, Waldkirch, Germany). This TLS system is a fast-scanning scanner with a 905 nm laser pulsing at 27 kHz, and a 0.25 degree angular resolution. Such low-cost sensors overcome the limitations of TLS in forest structural attribute evaluation, e.g., high cost, low mobility, and long scan time [25]. This system allows for effcient and fast sampling, but with a lower angular and point density resolution compared to higher-cost scanners. The lower resolution introduces challenges to the structural evaluation algorithms, and, as a result, applicable algorithms should be robust to low-density point cloud data. The sensor used in the CBL is a SICK LMS-151 scanner. The sensor is attached to a rotation stage, which rotates through 180 • , and provides a 270 • × 360 • coverage of the hemisphere above the scanner and a portion of the hemisphere below (a 90 • cone directly beneath the scanner remains unscanned; [22]). The height of the scanner in this study was set at slightly above breast height, in terms of diameter-at-breast-height (DBH; 1.3 m above ground). A maximum of two returns are collected for each lidar pulse from the system. The specifcations of the CBL are presented in Table 2 [26].

Point Cloud Denoising
The frst pre-processing step involved noise reduction the TLS point cloud data, since the complicated structure of this forest resulted in many noisy returns in the point clouds. For example, the polar scan location right above the scanner is always highly oversampled, due to the various 270 • scans intersecting at this polar region of the hemisphere, as the scanner rotates through 180 • for the full 270 • × 360 • scan. We therefore downsampled the point clouds using an algorithm based on the spherical sampling of the data points, in order to avoid any bias in our data. This technique uses a range-based weighting function, i.e., smaller weights for closer and higher weights for further removed scan locations [27]. Statistical Outlier Removal (SOR) was used as an additional step during the noise reduction process [28]. SOR calculates the distance of each point to its neighbors and removes those that are further away. The distance between the points is compared, based on the average distance between points in the point cloud. The number of neighbors chosen in this research was fve, which was obtained based on the density of the points in the vegetation structures we needed to maintain, and it allowed us to preserve these structural components and remove most of the noise points. Next, we used a connected component labeling algorithm for detecting more dense structures and removing irrelevant points, especially in the lower canopy where the structural complexity and the plant/point density were higher. Connected component labeling, which is also known as blob or region extraction, is an algorithm application of graph theory, where segments of connected components are labeled similarly [29]. Connected component labeling detects connected regions in images, and can be used in 3D form for point clouds. This algorithm segments the point cloud into smaller parts that are separated by a minimum distance; in this study, we used a grid size of 0.15 m for our segmentation. This value was chosen based on the point density of the vegetation structures from the lower portion of the point cloud, like stems and ferns. Connected component labeling was effective at removing the irrelvant (noise) lidar points, with the resulting segments including the actual forest structures of interest. This stage was critical because, even after SOR was applied, there were still some noise points remaining in the areas where the point cloud was dense and contained more complex, sub-canopy level, and arguably irrelevant structures, in terms of the study's objective (LAI estimation).

Detecting the Canopy
Detection of the higher canopy was achieved via normal change rate assessment. As mentioned earlier, the more the surface structure changes in adjacent points in the lidar point cloud, the more their normals are oriented differently, or the more variable these normals become. Figure 3 shows the result of the segmentation of the higher canopy in one of the scans. It is important to mention that, in some of the plots, due to the very complex vegetation structures (e.g., irregular stem or branch shapes), we needed to manually remove some of the points that were incorrectly segmented as higher canopy. This step of the algorithm can in future research be improved by deriving more geometric features for differentiating the canopy from other segments of the point cloud. The image on the left shows the original point cloud after noise reduction, and the image on the right shows the result of normal change rate assessment. It can be seen that the higher canopy is segmented from the lower portion of the scan, where the main part of the trunks and some of the lower ferns are located.

Finding Voxel Sizes
After segmenting the higher canopy regions of the plots, using the previously mentioned normal change rate assessment technique, and in order to perform the voxel-based approach for LAD measurement, we divided the canopy point cloud into layers in the Z direction. Each layer then was divided into other layers in the X direction, in order to decrease the computational time and also the confusion for our algorithm, which can be caused by excessively large numbers of points and a broad density distribution. The number of layers in the Z direction were chosen to be fve because this number provided us with a narrower range of density distributions in each layer, where the difference between the maximum and minimum point density was much lower than that for the whole canopy point cloud. This reduces the computational time and confusion for our algorithm. As a result, we can state that: where Δh is the height of each layer, z max is the maximum height of the points in the canopy, and z min is the minimum height of the points segmented as the canopy. Voxel size is a critical parameter that can affect the performance of the algorithm. We observed that larger voxels in the areas with a higher number of points can lead to an underestimation of the number of leaf points, whereas smaller voxels in regions where there are fewer points can lead to an overestimation of the density in these areas.
The mentioned factors can introduce extreme outliers into our model. As a result, we determined the voxel sizes in the X direction based on the number of points in each voxel and by using inter-quartile analysis. In each plot, we used the layer containing the highest number of points for fnding the voxel sizes. A third degree polynomial, shown by the following equation, was ftted to the density distribution of the layer: where x denotes the X coordinate values of the points. We then used interquartile analysis in order to fnd the initial size of the voxels. The second quartile (Q 2 ) is the median of the polynomial function. The frst and third quartiles (Q 1 and Q 3 ) are the medians of the lower and upper half of the dataset, respectively. The Inter-quartile Range (IQR) is described follows: We then calculated the density values that are half the IQR further from the median of the density distribution, Q 2 : After calculating these density values, we determined their corresponding X values. We call the X value at the median of the density, x median , and x 1 and x 2 are the X values at D 1 and D 2 , respectively. The distance between these X components are described with the following equations: We introduce our initial Δx as the minimum of these two distances because we want our voxels to be smaller in the areas that contain a large number of points: As a result, our frst two voxels have the size of Δx, while the starting point would be x median . It should be noted that, since the third degree polynomial may not be monotonous in the range of 0.5 × IQR from the median of the density distribution, there can be another condition, i.e., other than a monotonous distribution in that range that has to be considered, where there is a relative maximum in the range of 0.5 × IQR from the median. In this case, one of the D 1 or D 2 values would be the same as Equation (21) or (22), depending on which side of the median the maximum is located. To obtain the other X value, we calculate the difference in the density values of the median and the maximum (D max ), and then fnd the density value equal to 0.5 × IQR − (D max − Q 2 ). Afterwards, we determine the corresponding X value at this density, either higher or lower than the X at the maximum, depending on whether the maximum is on the right or left of the median, respectively.
After determining the width of the frst two voxels, we found the number of points in each and introduced the threshold used for detecting the voxel sizes, as the average of these two voxels: (26) where N 1 and N 2 are the number of points in the frst two voxels and the Nth is set as the threshold used for fnding the voxel sizes in the X direction. The next step involves calculating the width of the voxels based on this threshold, meaning that each voxel should contain a number of points equal to Nth. It is important to mention that, in the far ends of the point cloud, where the point density is signifcantly lower, if the number of points were less than one third of the threshold (Nth), we merged that region with the previous one, thereby forming one voxel. This is due to the fact that a small number of points result in an excessively narrow and inaccurate density distribution. Table 3 shows the statistics for the voxel sizes. Table 3. Voxel sizes statistics.

Measuring the Volume Density
The backscattered power (intensity) from woody materials in the shortwave-infrared (SWIR), i.e., the spectral region roughly between 1400-2500 nm, is higher than the power returned from leaves, due to the liquid water absorption by leaves [30]. The CBL operates at a wavelength of 905 nm, and, although the difference in intensities in foliar and woody materials at 905 nm is not as noticeable as in the SWIR region, there is still a higher intensity recorded for the stems and branches. Additionally, the leaves in the higher canopy of the HAVO site, visible in the lidar point cloud, are mostly fat and horizontally oriented. The scanned area of the leaves decrease dramatically further away from the scanner, multiplied by the cosine of the scan angle. However, due to the cylindrical shape of the branches and their relatively vertical structure, their scanned area is relatively consistent along the scene. The angular distribution of some of the detected branches was found to be between 72 • to 121 • , while the angular distribution of the leaves, which was derived from facets of the manually-detected leaf point clouds, was found to be between 11 • and −23 • [31]. This shows that the leaves are more horizontally oriented in this scene. As a result, we can assume that the total scanned area of the leaves is lower than for the stems and branches, especially in regions further from the scanner.
We also incorporated the images resulting from the intensity images of the point clouds, recorded by the scanner, in order to detect some of the branches and leaves accurately. We were able to detect points that were defnitively branches and others that were leaves, purely by detecting the intensity image edges and "outlining" the original image. This was done using the Canny edge detection algorithm [32]. In this approach, a Gaussian flter frst is applied to the image to reduce noise, after which the intensity gradient of the image is produced. Non-maximum suppression is applied afterwards to remove the incorrectly detected edges; this algorithm operates by fnding the main edges and removing detected edges that do not exhibit a strong connectivity to the main edges. We then measured the density of the detected foliar and woody regions. This was done for twenty plots and 40 segments of leaves and branches in each plot. We found that the density of the branches was higher than that for the leaves by an average order of magnitude of 10 in all of the 20 selected plots. x These results support our hypothesis that one will observe a higher lidar point density for branch regions, when compared to foliar materials in TLS point clouds. Finally, we calculated the volume density, i.e., the number of points in a known volume, in each of the canopy voxels. The frst issue is the grid size, which is used for measuring the density. We manually investigated 110 random plots from the original 242 plots. The average width of the biggest branch in the lowest portion of the higher canopy was found to be 0.09 m. Consequently, we chose the grid size to be 0.1 m on each side. In other words, in each of the voxels, the volume density in a 3D grid with the size of 0.1 m was measured.

Detecting Leaf Points
The next stage constitutes a seminal step in the process, namely the segmentation of leaves from branches, accomplished by setting a threshold for the density values in each voxel. We thus ftted a kernel density estimator to the density histogram in each voxel. In probability theory and statistics, Kernel Density Estimation (KDE) is a continuous and non-parametric approach for estimating the probability density function of a random variable [33]. KDE is used when no assumptions are made regarding the form of the data distribution, which is the case in our study. KDE can be estimated using the following formula: (27) where n is the number of points, K is the kernel, and h is the bandwidth, which is a smoothing parameter in the KDE formula. The density threshold was obtained based on half of the area under the KDE curve. Figure 4 shows an example from one of the distributions and the threshold chosen, based on the area under the curve. The curve on the right shows the threshold found for this distribution. The orange area is the area containing the leaf points' density values.
The points that have lower density values than the threshold are labeled as leaves, while those points above the threshold were labeled as non-leaves (woody material). We modeled the LAI using these leaf point densities, with the results presented in the next section.

KDE Bandwidth Selection
The bandwidth, or bin size, is an important parameter when using KDE. The bandwidth can be estimated using Silverman's (1986) rule of thumb. The formula for the bandwidth derived with this method is presented below: where n is the number of points and σ is the standard deviation of the density distribution.
Silverman's rule of thumb is calculated with an assumption of a normal distribution, which is unsuitable for our data. As a result, we reduced the constant value from 1.06 to 0.5 in order to avoid the over-smoothing caused by the normal density assumption. The bandwidth we used in this study therefore is obtained via the following equation: Figure 5 shows the difference between the two KDEs, one obtained with Silverman's rule of thumb, and the other one with the modifed bandwidth. Figure 6, representing the leaf vs. woody segmentation, shows the result of leaf detection in one voxel using the KDE thresholding.

LAI Estimation Using LAD
We also calculated the LAI directly from our detected leaf points using the LAD measurement. LAD is calculated as the total one-sided leaf area per unit volume. In the last step, LAI is calculated as sum of the LAD measurements in the layers in the Z direction: In our case, since we have discrete values for different height intervals, we can state: where i denotes the number of layers. Figure 7 shows the workfow/diagram of the methods used in this study. The results for modeling the LAI based on (i) the point densities and (ii) the direct derivation of LAI using the LAD estimation are presented in the next section.  Sens. 2019, 11, 1791 14 of 21

Results
We followed two approaches in evaluating our algorithm; frst, we modeled the feld-measured LAIs using the derived leaf point densities, and, second, we directly estimated the LAI using the detected leaf points and LAD assessment. In the case of direct estimation of the LAI values obtained from our TLS data analysis, we ran the algorithm on all 242 plots and achieved an average accuracy of 90%, ranging from 79% to 97%. Accuracy in this case is calculated by directly comparing the feld measured LAI and the derived LAI from TLS data in each plot. The following equation is used for assessing the accuracy: (32) where the FieldLAI represents the feld-measured LAI using the Accupar LP-80, and the LidarLAI is the TLS-derived LAI value in each plot. The RMSE value for LAI assessment was 0.31, with an overestimation of 9% for the average lidar-derived LAI, compared to the feld-measured LAI. We also found the accuracy was lower in the plots where the canopy structural complexity is higher, i.e., the vegetation density is higher in the canopy, especially in the lower portion of the canopy which decreases the lidar beam penetration to the higher levels of the canopy. We refer to "complexity" in this scenario in terms of canopy "density", i.e., expressed via the number of lidar pulses returned at specifc vertical strata throughout the canopy, and their inherent impact on transmission of laser pulses to higher strata; specifc examples are provided in the Discussion section. This issue introduces challenges to the leaf detection algorithm and associated LAI estimation. The obtained accuracy shows that this approach can be used for estimating LAI values in a complex forest environment, especially since an error <10% generally is acceptable in natural resource studies (see [34]).
In the case of modeling the LAI using the leaf point densities, we achieved an R 2 value of 0.88. The leaf point densities in each voxel in this case were calculated as the mean of the densities of the points labeled as leaves. Figure 8 shows the linear model obtained from this analysis. However, this latter analysis assumed no spatial auto-correlation between the 5 m-spaced scans, which necessitated an additional step to evaluate the robustness of our approach.
We next performed semi-variogram analysis to evaluate the spatial dependence between the feld-measured LAI values, since the feld-measured LAI arguably cannot be considered spatially independent across all of the 242 feld plots. A semi-variogram is measured for two points separated at a distance of h using the following formula [35]: (33) where M is a spatial point in the data, V is the geometric representation of the data, and f (M) is the value at that point, which in our work is the feld-measured LAI. Figure 9 shows the result of the semi-variogram analysis.
We found that the LAI values are spatially independent for plots spaced more than 30 m apart, based on the semi-variogram analysis. Consequently, we divided the 242 plots into six different data sets, each random iteration containing the plots that had a 30-m between-plot spacing. The LAI values in each of these sets were modeled using the derived leaf densities, and the R 2 values ranged from 0.84-0.96; Figure 10 shows the six linear models. The linear models for these subsets of data show that the algorithm performs well for the regions, even where the LAI can be considered spatially independent.

Discussion
The presented approach provides a method for automatic leaf detection and LAI assessment, an important parameter in structural evaluation of vegetation, for what could be considered complex forest environments. This algorithm reduces the time and cost associated with data collection for LAI assessment, and provides the ability to assess canopy-level structural attributes, which can be manually impractical due to the structural complexity of many forest environments. This study therefore contributes to plot-level assessment of canopy structural attributes, such as LAI and LAD, using a low-cost, portable TLS, and it does so at high accuracy and precision rates. While previous studies have shown that the higher-density TLS data can provide accurate estimations of LAI, using geometrical attributes of the canopy and directional gap fraction assessment [36,37], the lower-density data used in this work, in terms of lower angular resolution of the scanner and lower associated lidar point density, reduce the cost and time of data collection. However, low-density data make the derivation of geometric traits of the leaves and branches diffcult and introduce associated challenges to structural assessment via TLS data. As an example, Antonarakis et al. [36] used a directional gap fraction approach to estimate LAI using TLS in three forests which consisted mostly of commercial hybrid poplars. While this method proved to be effective when using high-density data, low-density data do not allow an accurate assessment of canopy gap fraction, due to an inability to capture geometric characteristics of the intra-canopy structures, i.e., leaves and branches, and to evaluate their impact on the radiation transferred through the canopy. Our approach thus uses only one variable, namely lidar point density, to detect leaves, and to evaluate LAD and LAI. The lidar point densities are obtained using statistical analysis of the canopy lidar point cloud, without the need of additional geometric and structural parameters. In addition to the issues caused by low-density data, the structural complexity of the studied tropical forests, i.e., the vegetation density and associated lidar point density in the canopy at different vertical strata, introduces challenges to plot-level LAI assessment, when compared to evaluating LAI using a controlled laboratory experiment (e.g., [38]), or a single tree scan (e.g., [39]). Li et al. [39] developed a point cloud slicing approach based on the canopy gap fraction, and were successful in estimating LAI using a single-tree scan. Although an approach like the one used in our work may not provide results as accurate as the ones obtained using methods based on high-density data and single tree scans, our method estimates plot-level LAI, thus reduces the time and cost of data collection, which results in more practical applications of TLS in forest inventories. In another study, Olsoy et al. [40] used TLS to model LAI in 42 Wyoming big sagebrush shrubs, using canopy cover, the percentage of the total ground area which is covered by the vertical projection of the tree crown, and canopy volume as TLS variables. They acquired an R 2 value of 0.73 for modeling LAI using canopy cover and an R 2 of 0.78 when modeling LAI using the canopy volume obtained based on a convex hull approach. Such a method can be effective in assessing LAI using small-range TLS data and single tree scans, while in our work we addressed these issues by developing a density-based approach for assessing the plot-level LAI, acquiring R 2 values in the range of 0.84 to 0.96. Additionally, the accuracy of canopy volume estimation can decrease when the structural complexity and vegetation density increases, and also in further ranges from the scanner, due to lower penetration rate of lidar pulses in those regions and lower associated lidar point density [41]. TLS data and voxel-based approaches, like the one presented in our work, also have been used for LAI assessment in relatively recent studies. Greaves et al. [42] used a voxel counting approach for LAI estimation in low-stature (<1.5 m tall) Arctic shrub species. In this method, the TLS point cloud is divided into voxels, and then the number of voxels that are occupied by at least one point are counted and the number of these voxels per unit ground area provides the LAI estimate (R 2 = 0.94 for close range (2 m) TLS data). While this approach is successful in LAI estimation of low-stature species, TLS data used in our work are from tropical forests with the maximum TLS point cloud height ranging from 13 m to 18 m. As a result, such a method, solely based on voxel counting, can in fact lead to PAI, instead of LAI, due to the inability to differentiate between leaves and branches. Zheng et al. [43] found that the nonphotosynthetic canopy components, e.g., branches, contributed from 19% to 54% to LAI estimation depending on canopy vegetation densities. Therefore, the density-based approach presented in our work for detecting leaves in low-density data increases the plot-level LAI estimation accuracy in forest environments with dense vegetation, such as tropical forests.
We did, however, observe an overestimation of 9% for the average lidar-derived LAI, compared to the feld-measured LAI. This slight discrepancy was partly attributed to the fact that the feld-measured LAI is highly dependent on the spatial characteristics of the location where the measurement is collected, while the TLS-derived LAI depends on the overall canopy-level spatial and structural attributes. Although the algorithm proved effective in achieving its objective, the performance is affected by the structural complexity in the canopy point cloud, specifcally vegetation density. It was found that the accuracy of LAI estimation decreases, coupled to increases in vegetation density, especially in the lower part of the canopy. These more complex, higher density lower-canopy regions effectively reduce the penetration rate of lidar pulses to the higher regions of the canopy, thereby resulting in associated lower LAI assessment accuracy. As an example, in the plot where we obtained the lowest LAI estimation accuracy, 79%, the point density of the lowest layer in the Z direction was 2.21 times the second lowest layer, and the density in the second lowest layer was 1.74 times the third one. This results in a distinct decrease in lidar point density in different height levels of the canopy, which is a result of lower lidar beam penetration due to higher vegetation density in the canopy, whereas, in the plot in which we acquired the highest accuracy of LAI estimation, 97%, the point density of the lowest layer in Z direction was 1.19 times the second lowest layer, and the point density of the second lowest layer was 1.22 of the third one. This shows that, in such plots, there is a higher point density throughout the whole canopy point cloud, which results in a more accurate estimation of LAI. To address this issue, future research can incorporate a fusion of airborne lidar and TLS data to mitigate the impact of lower-canopy structural complexity on lidar pulse penetration throughout the canopy, and thereby hypothetically increase the accuracy of such LAI assessment approaches.
We also showed that a robust approach, i.e., one that incorporates spatial auto-correlation considerations, did not result in signifcantly poorer LAI assessment performance. In fact, our full 242 plot set R 2 value of 0.88 for LAI modeling represented the range of R 2 values for the spatially-uncorrelated sets (R 2 = 0.84-0.96). This implies that, in practice, one could optimize or improve LAI assessments by evaluating the spatial auto-correlation of canopy structure variables, and then develop statistically-reliable, valid estimates for the main parameter of interest (LAI). It also could be argued that such an approach improved estimates, i.e., we achieved R 2 values as high as 0.96, by constraining our modeling to spatially-independent regions within the forest. We therefore recommend that future efforts also include an investigation into how such a spatial statistics approach applies to potential predictor variables, such as leaf density, gap fraction, etc., over and above the dependent variable (LAI).
Future research, over and above the previously-mentioned topics, can include the use of a dualor multi-wavelength scanner in order to better differentiate foliar and woody materials and thereby improve algorithm performance. If the scanner wavelength is located in the SWIR spectral region, e.g., 1520 nm, we expect to see a better separation between the leaf and non-leaf (woody) segments. This is due to the larger difference in moisture-impacted refectance values of these two materials in the SWIR wavelength range. As an example, Zhao et al. [44] used the dual wavelength Echidna lidar (DWEL; [30]), which collects data in two different wavelengths (1064 nm and 1548 nm), to measure the effective LAI. Such a system typically enables better discrimination between refectance values of woody and foliar materials in the canopy, due to its ability to collect lidar data in the SWIR spectral region. The Echidna also yields higher density point clouds, with an angular step width of approximately 1 mrad, compared to the discrete lidar system used in this work, the CBL, with a minimum angular step width of 4.6 mrad.

Conclusions
We presented an algorithm for modelling and estimating the LAI in a complex, tropical forest environment using lidar point densities, derived from TLS data. The data used in this study were from a portable (3.2 kg), rapid-scan (33 s), low-cost (USD 20,000) TLS system. We assumed that the intensity (or power) returned to the scanner from woody materials is higher than for the leaves at 905 nm, and supported this hypothesis by investigating the point cloud density, both in foliar and woody segments. We acquired an accuracy of 90%, and RMSE value of 0.31 in directly estimating the feld-measured LAI, using LAD assessment, and R 2 values of 0.84-0.96 for modeling the feld-measured LAI with leaf point density values, for spatially-uncorrelated plot locations. An overestimation of 9% was found in the TLS-derived LAI compared to the feld-measured LAI. This was attributed to the fact that TLS-derived LAI considers overall canopy-level structural attributes, while the feld-measured LAI is sensitive to the spatial characteristics (angular canopy interactions, illumination conditions, etc.) of the specifc location where the LAI is measured. The data in this study were from an arguably complex forest type, which makes each step of this approach more challenging. However, this algorithm works well in predicting the LAI using just one variable, namely leaf point density. We also considered the fact that the LAI values may not be spatially independent across the whole scene. As a result, by using semi-variogram analysis, we found that this approach works well for spatially-independent LAI values as well. This algorithm works well for automatic detection of leaf points in lower-density TLS lidar data. However, when higher-density (small angular resolution, high point density) lidar data are available, the geometric information of the leaves and branches potentially can be used to improve the leaf detection algorithm.
This algorithm can be used for evaluating LAI in complex forest environments and when derivation of the geometric information and structure from the data are not practicable, due to low scanner angular resolution (and associated point cloud densities) or limited structural retrievals of the scanned site. Finally, such a rapid-scan, fne-scale approach to LAI assessment could prove useful to the larger community when it comes to calibration/validation of landscape-to-regional scale products, derived from airborne-and spaceborne platforms.