This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (

Light Detection And Ranging (LiDAR) in forested areas is used for constructing Digital Terrain Models (DTMs), estimating biomass carbon and timber volume, and estimating foliage distribution as an indicator of tree growth and health. All of these purposes are hindered by the inability to distinguish the source of returns as foliage, stems, understorey and the ground except by their relative positions. The ability to separate these returns would improve all analyses significantly. Furthermore, waveform metrics providing information on foliage density could improve forest health and growth estimates. In this study, the potential to use waveform LiDAR was investigated. Aerial waveform LiDAR data were acquired for a New Zealand radiata pine plantation forest, and Leaf Area Density (LAD) was measured in the field. Waveform peaks with a good signal-to-noise ratio were analyzed and each described with a Gaussian peak height, half-height width, and an exponential decay constant. All parameters varied substantially across all surface types, ruling out the potential to determine source characteristics for individual returns, particularly those with a lower signal-to-noise ratio. However, returns from the ground had, on average, a greater intensity and decay constant and a narrower peak than returns from coniferous foliage. When spatially averaged, canopy foliage density (measured as LAD) varied significantly, and was found to be most highly correlated with the volume-average exponential decay rate. A simple model based on the Beer-Lambert law is proposed to explain this relationship, and waveform decay rate is put forward as a new metric that is less affected by shadowing than intensity-based metrics. This correlation began to fail when peaks with poorer curve fits were included.

Light Detection and Ranging (LiDAR) is a widely used active remote sensing method. The majority of applications of LiDAR use pulsed small-footprint laser light to determine the distance to a target with great precision. When laser beam reflections are positioned with respect to a known reference frame (typically using post-processed data from an integrated Global Positioning System and Inertial Measurement Unit, as well as range and scan angle measurements), three-dimensional models of the target can be generated. For ground mapping applications, generally only one laser is used, with a wavelength that is reflected by the target and transmitted by the surrounding medium, so the output data are geometric but not spectral. Often LiDAR data are fused with multi-spectral imagery to provide additional spectral information, which may be used to identify features in the data such as land cover or object type. When the land cover is multi-layered (e.g., in forests), each pixel integrates all detectable reflected light from all layers, so discrimination of overlying vertical layers with multi-spectral and hyper-spectral imagery is impossible.

Within forested landscapes, it is useful to be able to determine which returns came from which layer. A layer will comprise a type of cover, whether ground, understorey, coniferous canopy, deciduous canopy, etc. Often the layers will be vertical bands that may intermingle and overlap. Returns from the ground are useful for creating Digital Terrain Models (DTMs), which are used in forestry for road and harvest planning, erosion control and hydrology. Sithole and Vosselman [

Similarly, several authors have used the vertical distribution of LiDAR returns to determine tree heights [

Some authors such as Reitberger [

Foliage is where photosynthesis occurs, and hence the amount of foliage has a close relationship to the rate of biomass increase [

LiDAR return signals are a function over time determined by the transmitted waveform, the distance to the reflecting surface(s), the surface response, the spatial beam distribution and the response of the measurement system [

Discrete-return systems suffer from a sizable ‘blind spot’ following each detected return, during which no other returns can be detected [

A common use of waveform LiDAR is post-processing the waveform data to identify proximal peaks that would otherwise be treated as one [

In this study we investigate whether waveform shape can be used to distinguish returns from foliage, understorey and ground in a forest environment, and whether the foliage returns hold information on the local foliage density. Intensity-based metrics (peak height and width), which are likely to be affected by occlusion, are trialed along with the waveform decay rate, which is expected to be less affected by shadowing. Returns from foliage, wood and ground have been differentiated in terrestrial waveform LiDAR (such as in the ECHIDNA project, e.g., [

Waveform LiDAR data were collected over a radiata pine (^{2} were obtained over the sample plots. Whilst the conventional Optech ALTM 3100EA LiDAR system collected discrete return information, the waveform digitizer simultaneously recorded the waveform of the same laser pulses. Raw GPS data and discrete LiDAR information were processed with Optech’s proprietary data-extraction software REALM into a Corrected Sensor Data (CSD) file. The waveform data were measured at 1 ns intervals and provided as five swathes in Optech’s NDF binary format with an IDX index file. The CSD file was subsequently read by the authors (using Matlab) to obtain discrete return information, as well as positioning information that could be used to georeference each waveform sample in the NDF file using an adapted version of the Matlab code in [

This forest was selected because Beets

Two things immediately distinguish the waveform over forest from the waveform over open pasture:

The forest waveform consists of multiple peaks (returns)

The peak intensities (in arbitrary units defined by the hardware) of the forest waveform are lower than those of the open pasture waveform.

These two observations are common, but do not unambiguously distinguish a return from forest cover from a return from pasture. Many forest waveform traces may yield only one peak, and in many cases the peak amplitude of the forest return may be greater than that from pasture. There is a general similarity between the peaks in both traces that is driven by the shape of the outgoing pulse. If we remove the dominance of the outgoing pulse from the return signal, then more information can be learned about the target. Also of note in the figure is the almost exact similarity of the outgoing waveforms, and the similarity in width of the outgoing pulse and the return from pasture. It may seem odd that the return waveform in this case has a stronger intensity than the outgoing pulse (which, if genuine, would imply that extra energy were somehow added into the system), but this is an artifact of the way in which the outgoing pulse is sampled; the reflected and outgoing intensity units should not be considered equal.

Jutzi and Stilla [

The Wiener deconvolution filter can be used to find
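As a minimal sketch of Wiener deconvolution in general (not the authors' Matlab implementation), the received waveform can be divided by the outgoing pulse in the frequency domain, with a regularization term that stops noise being amplified where the pulse spectrum is weak. The constant `noise_to_signal`, the naive transform `dft` and the synthetic pulse and returns are all assumptions made for illustration.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short waveforms)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def wiener_deconvolve(received, outgoing, noise_to_signal=1e-6):
    """Estimate the surface response from a received waveform.

    Both inputs must have the same length. `outgoing` is assumed to be
    centred on sample 0 (wrapping around) so that the result is not shifted.
    `noise_to_signal` is a hypothetical constant regularization term.
    """
    g = dft(received)
    h = dft(outgoing)
    f = [hk.conjugate() * gk / (abs(hk) ** 2 + noise_to_signal)
         for hk, gk in zip(h, g)]
    return [v.real for v in dft(f, inverse=True)]


# Synthetic demonstration: two ideal returns blurred by a Gaussian pulse.
n = 64
sigma = 1.5  # pulse width in samples (illustrative value)
pulse = [math.exp(-0.5 * (min(t, n - t) / sigma) ** 2) for t in range(n)]
pulse_sum = sum(pulse)
pulse = [p / pulse_sum for p in pulse]

truth = [0.0] * n
truth[20] = 1.0  # strong return
truth[35] = 0.5  # weaker return

# Circular convolution of the true response with the outgoing pulse.
received = [sum(truth[j] * pulse[(i - j) % n] for j in range(n))
            for i in range(n)]
recovered = wiener_deconvolve(received, pulse)
```

In this synthetic case the recovered peaks are taller and narrower than the blurred ones and stay at the same sample positions. A larger `noise_to_signal` suppresses more noise at the cost of broader peaks.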

Once we have obtained a deconvolved waveform (using the outgoing waveform), this may be analyzed so that characteristic metrics may be recorded. The metrics employed in this study to describe each return are: height above ground, peak height, peak width (evaluated as a Gaussian half-height width) and the exponential decay constant of the return shape between the peak and the next local minimum. This study aims to show that peak height and width will be affected by shadowing deep in the canopy (and not be a good discriminator), but that exponential decay rate will be less affected (see results and discussion). The method of obtaining these is:

(1) Georeference every 1 ns waveform sample to determine which waveforms (or part waveforms) fall into the field plots.

(2) Determine a ground surface for the field plots.

(3) Employ Gaussian fitting to describe each peak with a peak height and half-height width.

(4) Employ exponential curve-fitting to determine the exponential decay rate of each peak.

(5) Determine the above-ground height of each peak (for comparison with field-measured foliage data).

Step (1) was achieved using the methods and code in Parrish [

Step (2) was achieved by using the discrete LiDAR data set for the area flown. The ground-filtering algorithm (GroundFilter.exe) in FUSION [

As we are interested in decay rates of waveform peaks in (4), we do not need to search for additional ‘hidden’ peaks as other authors have. Peaks that are hidden through close proximity to another or low peak amplitude will not have sufficient data points after the peak to give a good decay. As such, we only select peaks in the waveform data with a corresponding discrete return. Gaussians were fit using Matlab’s fminsearch function, a simplex search method given in [^{2}
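To illustrate how a peak can be summarized by a Gaussian height and half-height width, the sketch below uses a closed-form log-parabola least-squares fit. This technique is swapped in for brevity and is not the simplex search used in the study. The function names and the synthetic samples are hypothetical.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def fit_gaussian(times, amplitudes):
    """Fit height * exp(-(t - centre)^2 / (2 sigma^2)) to a single peak.

    ln(amplitude) is fitted with a quadratic in t by least squares, so only
    strictly positive samples are used. Returns (height, centre, fwhm),
    where fwhm is the full width at half the peak height.
    """
    t0 = sum(times) / len(times)  # centre the time axis for stability
    pts = [(t - t0, math.log(y)) for t, y in zip(times, amplitudes) if y > 0]
    s = [sum(t ** k for t, _ in pts) for k in range(5)]
    rhs = [sum((t ** k) * ly for t, ly in pts) for k in range(3)]
    normal = [[s[i + j] for j in range(3)] for i in range(3)]
    c0, c1, c2 = solve3(normal, rhs)
    if c2 >= 0:
        raise ValueError("samples do not describe a peak")
    sigma = math.sqrt(-1.0 / (2.0 * c2))
    centre = t0 - c1 / (2.0 * c2)
    height = math.exp(c0 - c1 ** 2 / (4.0 * c2))
    fwhm = 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma
    return height, centre, fwhm


# Synthetic noise-free peak sampled at 1 ns intervals (illustrative values).
true_sigma = 4.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
times = list(range(8, 18))
amps = [30.0 * math.exp(-0.5 * ((t - 12.3) / true_sigma) ** 2) for t in times]
height, centre, fwhm = fit_gaussian(times, amps)
```

On noisy data a log-domain fit gives undue weight to low-amplitude samples, which is one reason an iterative least-squares search on the original amplitudes may be preferred.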

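In (3), the location of each peak to be analyzed will have been determined. An exponential curve of the type e^{−λx} was fitted to the samples between each peak and the next local minimum, giving the decay constant λ and the R^{2} value of the fit.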
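A minimal sketch of the decay-rate step is given below, assuming the decay constant is estimated from the samples between a peak and the next local minimum. The log-linear least-squares fit is a simplification chosen for brevity and may differ from the fitting routine used in the study. The helper names and the synthetic waveform are hypothetical.

```python
import math


def tail_after_peak(samples, peak_index):
    """Samples from the peak up to and including the next local minimum."""
    end = peak_index
    while end + 1 < len(samples) and samples[end + 1] < samples[end]:
        end += 1
    return samples[peak_index:end + 1]


def fit_decay(offsets, amplitudes):
    """Fit amp * exp(-lam * x) by least squares on ln(amplitude).

    Returns (amp, lam, r2), with R^2 evaluated on the original amplitudes.
    All amplitudes are assumed to be strictly positive.
    """
    logs = [math.log(y) for y in amplitudes]
    n = len(offsets)
    mean_x = sum(offsets) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in offsets)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(offsets, logs))
    slope = sxl / sxx
    lam = -slope
    amp = math.exp(mean_l - slope * mean_x)
    mean_y = sum(amplitudes) / n
    ss_res = sum((y - amp * math.exp(-lam * x)) ** 2
                 for x, y in zip(offsets, amplitudes))
    ss_tot = sum((y - mean_y) ** 2 for y in amplitudes)
    return amp, lam, 1.0 - ss_res / ss_tot


# Synthetic deconvolved waveform: a peak at index 2 decaying at 0.6 per sample.
waveform = ([2.0, 10.0]
            + [40.0 * math.exp(-0.6 * k) for k in range(5)]
            + [9.0, 12.0, 5.0])
tail = tail_after_peak(waveform, 2)
amp, lam, r2 = fit_decay(list(range(len(tail))), tail)
```

Here `lam` is expressed per sample; with 1 ns sampling it is numerically the same per nanosecond.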

As this study is a proof of concept, only peaks that corresponded to a return in the discrete data set were analyzed (see step 3). Ground height was obtained by linear interpolation of the ground surface generated in (2) to the discrete return’s
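The height-above-ground calculation can be sketched as follows, assuming the ground surface is stored as a regular grid of elevations and interpolated bilinearly. The grid layout, the function names and the sample coordinates are assumptions for illustration; the study's surface representation may differ.

```python
import math


def ground_height(grid, x0, y0, cell, x, y):
    """Bilinearly interpolate a gridded ground surface at (x, y).

    grid[row][col] holds the elevation at (x0 + col * cell, y0 + row * cell).
    The point is assumed to lie inside the grid.
    """
    fx = (x - x0) / cell
    fy = (y - y0) / cell
    col = min(max(int(math.floor(fx)), 0), len(grid[0]) - 2)
    row = min(max(int(math.floor(fy)), 0), len(grid) - 2)
    tx = fx - col
    ty = fy - row
    low = grid[row][col] * (1 - tx) + grid[row][col + 1] * tx
    high = grid[row + 1][col] * (1 - tx) + grid[row + 1][col + 1] * tx
    return low * (1 - ty) + high * ty


def height_above_ground(return_xyz, grid, x0, y0, cell):
    """Subtract the interpolated ground elevation from the return elevation."""
    x, y, z = return_xyz
    return z - ground_height(grid, x0, y0, cell, x, y)


# Hypothetical 2 m grid and a single return at (1.0, 0.5) with z = 120 m.
grid = [[100.0, 102.0],
        [104.0, 106.0]]
agl = height_above_ground((1.0, 0.5, 120.0), grid, 0.0, 0.0, 2.0)
```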

The net result of this analysis is an attribute table for each discrete return. Unlike standard discrete returns, which are described only by their coordinates and an intensity (excluding secondary processing such as classification), each return will also be assigned a Gaussian peak height, half-height width, decay rate and height above ground. Additionally, the R^{2} values of the two curve fits are recorded.

As this study is a proof of concept, only the best peaks were selected, as poor peaks (with a low signal-to-noise ratio) are prone to poor curve fits and hence will mask any potential correlations. Because deconvolution amplifies background noise, only waveforms with at least one intensity measurement greater than 25 (SNR > 2) were selected for analysis. After analysis, any peaks for which either the Gaussian fitting or the exponential curve-fitting yielded an R^{2} value below 0.75 were discarded.
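The two selection rules can be expressed as a simple filter. The thresholds of 25 and 0.75 are taken from the text; the record layout, field names and sample values below are hypothetical.

```python
INTENSITY_THRESHOLD = 25  # raw intensity units, from the text
R2_THRESHOLD = 0.75       # minimum R^2 required of both curve fits


def waveform_is_usable(samples):
    """True if at least one sample exceeds the intensity threshold."""
    return max(samples) > INTENSITY_THRESHOLD


def peak_is_usable(peak):
    """True if both the Gaussian and the exponential fit are good enough."""
    return (peak["gaussian_r2"] >= R2_THRESHOLD
            and peak["decay_r2"] >= R2_THRESHOLD)


# Hypothetical records, for illustration only.
waveforms = [
    {"samples": [3, 8, 41, 17, 6],
     "peaks": [{"gaussian_r2": 0.93, "decay_r2": 0.88}]},
    {"samples": [2, 5, 19, 9, 4],
     "peaks": [{"gaussian_r2": 0.97, "decay_r2": 0.95}]},
    {"samples": [4, 30, 12, 28, 7],
     "peaks": [{"gaussian_r2": 0.81, "decay_r2": 0.52},
               {"gaussian_r2": 0.79, "decay_r2": 0.90}]},
]

kept = [peak
        for wf in waveforms if waveform_is_usable(wf["samples"])
        for peak in wf["peaks"] if peak_is_usable(peak)]
```

The second waveform is rejected outright because no sample exceeds 25, and the first peak of the third waveform is rejected because its exponential fit is poor.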

A total of 136,191 waveforms were incident over the ten 0.16 ha plots analyzed. Of these, 127,167 (93%) had an intensity greater than 25, and within them 173,129 returns were identified and analyzed, giving approximately 1.4 returns per waveform. A further 121,051 of these returns were discarded, as the R^{2} values of either (or both) the Gaussian and exponential curve fits were less than 0.75 (

We cannot compare individual pulse metrics with LAD values as LAD was defined over a larger volume, but we can compare average LAD over a volume with average parameter values.

We will not investigate the LiDAR metric values below 4 m that will be affected by understorey. Also, we will exclude bands where the LAD <1 m^{2} per m^{3} because the foliage is so sparse that most LiDAR pulses will simply miss it. ^{2}
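The band-averaged comparison can be sketched as below, assuming 2 m height bands and a Pearson correlation between the mean decay rate and the LAD of each band. The exclusion rules follow the text. All numerical values are invented for illustration and are not the study's measurements.

```python
import math
from collections import defaultdict

BAND_DEPTH_M = 2.0  # thickness of each height band
MIN_HEIGHT_M = 4.0  # returns below this height are ignored (understorey)
MIN_LAD = 1.0       # bands with sparser foliage than this are ignored


def mean_decay_by_band(returns):
    """Average decay rate per height band.

    `returns` is an iterable of (height_above_ground_m, decay_rate) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for height, decay in returns:
        if height < MIN_HEIGHT_M:
            continue
        band = int(height // BAND_DEPTH_M)
        totals[band][0] += decay
        totals[band][1] += 1
    return {band: total / count for band, (total, count) in totals.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented returns and field LAD values (band index -> LAD).
returns = [(1.5, 0.90), (5.0, 0.30), (5.5, 0.34), (7.2, 0.40),
           (8.9, 0.44), (10.1, 0.50), (11.0, 0.54), (12.5, 0.20)]
lad_by_band = {2: 1.2, 3: 1.6, 4: 2.0, 5: 2.4, 6: 0.4}

band_means = mean_decay_by_band(returns)
bands = sorted(b for b in band_means if lad_by_band.get(b, 0.0) >= MIN_LAD)
r = pearson_r([lad_by_band[b] for b in bands],
              [band_means[b] for b in bands])
```

Band 6 is dropped because its LAD is below the threshold, and the return at 1.5 m is dropped because it lies below 4 m.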

^{3}, but varies substantially in localized regions (^{3}, it will be determined by the local value not the wider-scale average. Hence, individual values will vary, but if the sampling were fair, the average values of any metrics indicative of LAD should show strong correlations with the average value of LAD. However, sampling was not fair, as a return will only be registered if the pulse encounters sufficient material in one sample volume, thereby creating a tendency to overestimate LAD in more vegetated volumes. However, the fact that there is a moderate correlation between average decay rate and LAD is scientifically interesting. Note that as mentioned in Section 2.3, this correlation is only for the peaks with high intensities and good curve fits (^{2}^{2}^{2}^{2}^{2}

There are two ways that waveform LiDAR of a vegetated area may be interpreted:

The standard interpretation is as the result of a series of discrete ‘hard’ returns, such as may be achieved by well-separated layers of foliage. These supposed layers are of a thickness equal to or less than the return distance travelled by the laser pulse in one sampling period (0.15 m for 1 ns sampling).

As the result of transmission and reflectance from a volume of semi-transparent foliage which attenuates the radiation exponentially.
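The 0.15 m figure quoted for 1 ns sampling follows from the two-way travel of the pulse, as this short check shows. The vacuum speed of light is used as an approximation for air.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s in vacuum, close to the value in air


def range_per_sample(sample_period_s):
    """One-way range covered in one sampling period (the pulse travels out and back)."""
    return SPEED_OF_LIGHT * sample_period_s / 2.0


range_1ns = range_per_sample(1e-9)  # approximately 0.15 m
```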

Clearly, the discrete Gaussian interpretation becomes similar to the volumetric interpretation once the separation of the Gaussians is equal to or less than the sampling period. Previous authors have used Gaussian fitting to distinguish a small number of returns separated by significantly more than the sampling period [_{T}_{0}_{R}

Thus, we see that if each canopy sample is of a constant density and distribution, it will have an absorption coefficient

The benefit of using an exponential decay rate is that it is not affected by shadowing. Shadowing affects both peak height and peak width due to the reduction in irradiance incident on lower surfaces. This—along with the complex nature of the surfaces—contributes to the observed variation in values for the same surface with these metrics. Ideally, shadowing could be accounted for by the summation of all samples in a waveform prior to the peak. However, much of the actual attenuation will not lead to a reflected signal at the sensor distinguishable from the background noise. This problem could be solved by equipment that is more sensitive and perhaps a different frequency laser, but such changes come with other issues and are not currently in the public domain. In the absence of an intensity-based method to account for shadowing, decay rate gives valuable information that is not affected by it. In summary, this interpretation provides insight into our observation that waveform decay rate has the potential to determine foliage density when averaged over a local area, bypassing some of the issues of using variables based on pulse intensity. Furthermore, if LAD can be shown to be moderately correlated with other (independent) metrics selected for that species and site (such as Normalized Difference Vegetation Index from aerial imagery), then the inclusion of spatially-averaged decay rate in a multiple regression may enable very robust estimates.
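The shadowing argument can be illustrated numerically under a deliberately simple assumption: that the signal returned from depth d within a uniform foliage layer follows a single exponential, incident × exp(−attenuation × d). The function names and all parameter values are hypothetical, and this is a sketch of the idea, not the authors' model.

```python
import math


def layer_return(incident, attenuation, depths):
    """Signal from each depth in a uniform layer (Beer-Lambert-style decay)."""
    return [incident * math.exp(-attenuation * d) for d in depths]


def decay_rate(depths, intensities):
    """Decay constant from a least-squares line through ln(intensity)."""
    logs = [math.log(i) for i in intensities]
    n = len(depths)
    mean_d = sum(depths) / n
    mean_l = sum(logs) / n
    slope = (sum((d - mean_d) * (l - mean_l) for d, l in zip(depths, logs))
             / sum((d - mean_d) ** 2 for d in depths))
    return -slope


depths = [0.15 * i for i in range(8)]         # metres, one sample per 0.15 m
sunlit = layer_return(100.0, 0.8, depths)     # unshadowed layer
shaded = layer_return(35.0, 0.8, depths)      # same foliage, less incident light
denser = layer_return(100.0, 1.4, depths)     # denser foliage, full light

rate_sunlit = decay_rate(depths, sunlit)
rate_shaded = decay_rate(depths, shaded)
rate_denser = decay_rate(depths, denser)
```

Under this assumption, shading lowers the peak height but leaves the fitted decay rate unchanged, whereas denser foliage raises the decay rate.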

The primary aim of this study was to investigate whether waveform shape could be used to identify the source of individual LiDAR returns from forested landscapes. Waveform shape was quantified by curve fitting, which yielded a peak height, half-height width and exponential decay rate for each return significant enough to trigger the discrete-return sensing system. A subset of these with the best signal-to-noise ratio (SNR > 2) and curve fits (R^{2}

Three-dimensional spatial averages of these waveform metrics were shown to give indications of surface type over larger areas and volumes. For example, ground returns on average presented waveforms with higher peaks, narrower widths and faster decays. Foliage returns conversely averaged lower peaks, wider pulses and slower decays. Decay rate showed the best correlation with average Leaf Area Density (defined in 2 m height bands over a 0.16 ha plot), when the average value for all returns in each volume was obtained. A linear correlation with an R^{2}

To explain the correlation between the exponential decay of returns and the LAD, a model was presented which approximates the canopy as a semi-transparent gas, and utilizes the Beer-Lambert Law to model the exponential decay of reflected intensity. This model has merit when used in conjunction with the standard interpretation of waveform LiDAR over vegetation, which explains the returns as a series of discrete layers leading to a one-dimensional Gaussian mixture. To further evaluate this model and to develop a robust correlation between foliage density and waveform decay rate, further testing is required. It may be advantageous to conduct this testing in a controlled environment, such as a laboratory, where small foliage samples may have LAD and occlusion measured precisely and then be scanned by LiDAR. This laboratory testing should also enable us to better quantify the extent of spatial averaging needed to obtain an optimal balance between the competing goals of accuracy and localization in our LAD estimates.

The work in this paper was funded by the Scion Capability Fund and used data collected by the Ministry for the Environment. Many thanks to Nigel Searles from the Ministry for the Environment for use of the data, and to Andrew Dunningham, Jonathan Harrington and David Pont from Scion for their help and knowledge. Particular thanks must also go to New Zealand Aerial Mapping, in particular Tim Farrier, for providing an excellent LiDAR service and for assistance with countless queries and demands.

Raw waveform data recorded from two pulses over open pasture and forest cover.

Convolved and deconvolved forms of the waveforms in

Deconvolved LiDAR data (blue) with multiple Gaussian fitting (red, one on the left, two on the right), and decay rates of each peak estimated (green).

Peak height of Gaussian curves fitted to deconvolved waveform LiDAR

Bivariate frequency distribution of peak height of Gaussian curves fitted to deconvolved waveform LiDAR

Bivariate frequency distribution of half-height width of Gaussian curves fitted to deconvolved waveform LiDAR

Bivariate frequency distribution of decay rate of exponential curves fitted to deconvolved waveform LiDAR

Scatter plot of half-height width

Variation of Leaf Area Density (LAD)

Comparison of the average value in a set of 1,600 m^{2} × 2 m sample plots for (

Summary of number of returns analyzed for the ten field plots with leaf area density data.

Plot | Waveforms | Waveforms with intensity > 25 | Returns retained | Returns retained (%) |
1 | 12,558 | 11,616 | 4,570 | 28% |
2 | 16,150 | 14,939 | 5,608 | 27% |
3 | 12,305 | 11,261 | 4,942 | 33% |
4 | 11,332 | 10,286 | 4,207 | 29% |
5 | 12,326 | 11,075 | 4,541 | 31% |
6 | 13,810 | 13,130 | 5,488 | 32% |
7 | 12,687 | 11,992 | 4,598 | 27% |
8 | 14,849 | 14,210 | 4,847 | 25% |
9 | 15,862 | 15,145 | 8,746 | 45% |
10 | 14,312 | 13,513 | 4,531 | 24% |
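The per-plot counts can be cross-checked against the totals quoted in the text. The column meanings used here (waveforms, waveforms with intensity greater than 25, returns retained) are inferred from the fact that the column sums reproduce those totals; the variable names are illustrative.

```python
# plot: (waveforms, waveforms with intensity > 25, returns retained)
plots = {
    1: (12558, 11616, 4570),
    2: (16150, 14939, 5608),
    3: (12305, 11261, 4942),
    4: (11332, 10286, 4207),
    5: (12326, 11075, 4541),
    6: (13810, 13130, 5488),
    7: (12687, 11992, 4598),
    8: (14849, 14210, 4847),
    9: (15862, 15145, 8746),
    10: (14312, 13513, 4531),
}

total_waveforms = sum(row[0] for row in plots.values())
total_bright = sum(row[1] for row in plots.values())
total_retained = sum(row[2] for row in plots.values())

returns_identified = 173129  # quoted in the text
returns_discarded = 121051   # quoted in the text

bright_percent = round(100 * total_bright / total_waveforms)
returns_per_waveform = round(returns_identified / total_bright, 1)
retained_percent = round(100 * total_retained / returns_identified)
```

The sums give 136,191 waveforms, 127,167 waveforms above the intensity threshold (93%) and 52,078 retained returns, which equals 173,129 minus 121,051 and is about 30% of the returns identified.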