1. Introduction
Tree deaths are significant in a circular ecology, since they provide resources to many organisms [
1] and habitat to many mammals and birds [
2]. Zielewska-Büttner et al. [
3] showed that the abundance of dead standing trees (snags) was the most important predictor of woodpecker habitat sections, thereby stressing the importance of dead wood. At the same time, decaying wood is reduced in managed forests, which is a threat to organisms whose lives depend on dead wood [
4]. According to Rose et al. [
5], it is less expensive to preserve natural habitat than to regenerate it. There is, therefore, a need for understanding and maintaining the natural distribution of dead wood across a forest.
In Southern Australia, river red gums (
Eucalyptus camaldulensis) grow along the banks of the Murray River and its floodplains. Floods are key ecological components, while extinctions of native species could be driven by anthropogenic factors [
6]. Since the construction of Lake Hume in 1934, the flow of the Murray River has been regulated [
7]. As a result, the annual flow of the river has decreased [
8] and fewer, shorter floods occur annually [
9]. George [
10] identified increased stress and health decline of red gums, as well as a reduction in seed production, which threatens a decline of the population.
Furthermore, tree hollows play a substantial role in preserving biodiversity in Southern Australia [
2,
11], because most arboreal mammals and numerous native protected bird species rely on them for sheltering [
12]. Nevertheless, in Australia there are no hollow creators like the woodpeckers that exist in the northern hemisphere. For that reason, it takes hundreds of years for a hollow to be formed [
13] by insect and fungal attacks when access points are provided through damage caused by wind and storms. Furthermore, Gibbons et al. [
14] claim that dead trees or trees in poor physiological condition are more likely to contain hollows. In parallel, Lindenmayer and Wood [
15] and Goldingay [
16] predicted that, in the near future, a shortage of hollows for colonisation will exist due to anthropogenic factors. For these reasons, this study aims to contribute to managing dead wood in Southern Australia by improving automated detection of snags.
This study focuses on detecting dead standing eucalypt trees from a Southern Australian forest that has been influenced by the reduced flow and floods of the Murray River. Even though detection of both fallen [
17] and standing [
18] dead trees is important for managing dead wood in Southern Australia and preserving biodiversity, they are addressed differently from a classification perspective. Fallen trees are detected by identifying line-like features on the Digital Terrain Model (DTM) that is created from the LiDAR point cloud [
19,
20]. With respect to dead standing trees, the following features could contribute to detecting them: their light reflectance, since they absorb more green light [
21], and their shape, since they are less leafy and more likely to have broken branches [
22]. Additionally, Yao et al. [
22] and Shendryk et al. [
23] performed tree delineation before classifying standing trees as dead or alive.
LiDAR is extremely useful in forestry because the laser can penetrate the forest canopy through the gaps between branches and leaves. Therefore, significant information about forest structure at tree level is collected. Early LiDAR systems recorded only discrete peak returns (discrete LiDAR), which usually correspond to large branches and the ground. In these discrete LiDAR systems, a minimum distance between two recorded returns existed; for the Leica ALS50 sensor there has to be at least a 2.7 m gap between two recorded returns [
24]. Over the years, and with technological advances, LiDAR systems became able to digitise and record the entire backscattered signal. The backscattered signal is digitised and stored as a number of equally spaced waveform samples (e.g., 15–30 cm vertical resolution). The intensity of each waveform sample corresponds to an equal pulse width, since the signal is digitised at equally spaced time intervals and, therefore, the time distance between every two consecutive waveform samples is constant. This is explained in the file format specifications of LAS1.3 [
25], LAS1.4 [
26] and Pulsewaves [
27]. In 2006–2009, finding peak points from the waveform data [
28,
29], for example by Reitberger et al. [
30] and Chauve et al. [
31], attracted the interest of the scientific community. Scientists were able to find additional returns that were not acquired by the discrete systems due to the minimum gap that had to exist between two returns [
28]. Reducing the waveform data to discrete returns makes it easier to handle the increased amount of recorded information and to work with existing workflows. Nowadays, many sensors acquire only waveform data (e.g., Trimble AX60) and discrete data are produced by analysing the waveforms and extracting peak points at post-processing. Therefore, the terms “extraction of peak returns from full-waveform (FW) LiDAR” and “discrete LiDAR” can sometimes be used interchangeably. Even though Anderson et al. [
32] showed that FW LiDAR is worth the extra processing and can estimate forest-related parameters better, there are still questions to be answered: with the increased acquired pulse density and the advancement of technologies, is the waveform still worth it? In addition, even though extraction of peak points from FW LiDAR could be identical to the delivered discrete data (tested by the authors), there are alternative ways of interpreting the waveform data, e.g., classification of waveforms according to their shape [
33] and voxelisation [
34].
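As an illustration of how discrete returns can be derived from a digitised waveform, the following minimal Python sketch marks every local maximum above a noise threshold as a peak. It is not the method of any cited work; the function name `extract_peaks`, the threshold and the sample values are assumptions made for this example only.

```python
def extract_peaks(samples, noise_threshold):
    """Return (sample_index, intensity) for each local maximum above the threshold.

    Hypothetical helper: `samples` holds equally spaced waveform intensities.
    """
    peaks = []
    for i in range(1, len(samples) - 1):
        value = samples[i]
        if value <= noise_threshold:
            continue  # treat low intensities as noise
        # Strict rise followed by a non-strict fall, so a flat-topped
        # peak is reported once, at its first sample.
        if samples[i - 1] < value >= samples[i + 1]:
            peaks.append((i, value))
    return peaks


# Illustrative waveform with two nearby echoes (values are made up).
waveform = [2, 3, 2, 10, 35, 60, 42, 20, 31, 55, 30, 8, 3, 2]
peaks = extract_peaks(waveform, noise_threshold=15)
print(peaks)
```

In this made-up waveform the two peaks are four samples apart, which at a 15 cm sample spacing is 0.6 m, closer than the 2.7 m minimum gap quoted above for a discrete system.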
Voxelisation is the process of inserting either the waveform samples or the discrete points into a 3D regular grid. Afterwards, these voxelised data are used to derive terrain, canopy and other tree-related metrics. The concepts of voxelisation (as explained below in
Section 2.2.1 or with similar interpretations) have been used in forestry for handling both discrete [
35,
36,
37,
38,
39,
40] and waveform [
41,
42,
43] data. In comparison to the discrete data, the waveform data are more likely to contain noise. Nevertheless, pulse density depends on the speed of the flight and the scanning pattern, while estimation of forest parameters is pulse density dependent [
44]. The intensity of each waveform sample corresponds to equal pulse width [
25,
26,
27]. They are, therefore, comparable to each other and the intensity values of each voxel can be normalised, overcoming the uneven density of LiDAR footprints [
45]. This study focuses on the algorithm that tackles height variations while working with eucalypt native forests and, therefore, comparison of the results between discrete and FW LiDAR data is not conducted. It uses voxelisation for interpreting the FW LiDAR data, but it could have voxelised the discrete LiDAR instead and applied the same methodology.
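The voxelisation and normalisation idea can be sketched as follows. This is a simplified illustration rather than the implementation used in this study: it assumes that normalisation means averaging the intensities that fall inside each voxel, and the function name `voxelise`, the grid origin and the sample values are hypothetical.

```python
from collections import defaultdict


def voxelise(samples, origin, voxel_size):
    """Insert (x, y, z, intensity) samples into a regular 3D grid.

    Returns {(ix, iy, iz): mean intensity}. Averaging per voxel is an
    assumed normalisation that reduces the effect of uneven sample density.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    ox, oy, oz = origin
    for x, y, z, intensity in samples:
        key = (
            int((x - ox) // voxel_size),
            int((y - oy) // voxel_size),
            int((z - oz) // voxel_size),
        )
        sums[key] += intensity
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Two samples share a voxel, a third falls into the neighbouring one.
example_samples = [
    (0.2, 0.2, 0.2, 10.0),
    (0.4, 0.1, 0.3, 30.0),
    (1.5, 0.2, 0.2, 8.0),
]
grid = voxelise(example_samples, origin=(0.0, 0.0, 0.0), voxel_size=1.0)
print(grid)
```

Storing only non-empty voxels in a dictionary keeps the sketch short; a dense array would be the more usual choice for a full flight line.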
Classification at tree level in a native forest is challenging, since tree delineation is usually performed before health assessment [
22,
23]. Tree delineation can be achieved by first detecting local maxima from the Canopy Height Model (CHM) and then segmenting the CHM into individual trees with the watershed algorithm [
46]. The introduction of the marker controlled watershed algorithm improved the results [
47] and further improvements were made by including structural information of the tree trunks and the under-storey layers of the canopy [
48]. Bottom-to-top delineation was proposed by Lu et al. [
49] for segmenting deciduous trees from data collected during the leaf-off season. Similarly, Shendryk et al. [
50] published an interesting bottom-to-top red gum delineation algorithm [
22]. Once trees are delineated, they can be classified as either dead or alive.
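The first step of CHM-based delineation, detecting local maxima as candidate tree tops, can be sketched as below. The sketch covers only this marker-detection step and not the watershed segmentation; the function name `local_maxima`, the 3 × 3 neighbourhood, the minimum height and the CHM values are assumptions for illustration.

```python
def local_maxima(chm, min_height):
    """Return (row, col) cells that are strictly higher than all 8 neighbours.

    Hypothetical helper: `chm` is a list of rows of canopy heights.
    """
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if height < min_height:
                continue  # ignore ground and low vegetation
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(height > n for n in neighbours):
                tops.append((r, c))
    return tops


# Made-up 4 x 4 CHM in metres with two crowns.
example_chm = [
    [2, 3, 2, 1],
    [3, 9, 4, 2],
    [2, 4, 3, 7],
    [1, 2, 6, 12],
]
tree_tops = local_maxima(example_chm, min_height=5)
print(tree_tops)
```

The detected cells would serve as markers for a marker-controlled watershed segmentation of the CHM.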
Previous work used multi-scale 2D analysis on Digital Elevation Models (DEMs) or Canopy Height Models (CHMs) for detecting tree tops and delineating trees [
51]. Jing et al. [
47] also used a multi-scale 2D approach on the CHM of a dense point cloud (45 points/m²) for tree delineation: at first, Jing et al. [
47] used scale analysis for determining dominant tree size. Then, they produced segmentation maps at multiple scales using the marker-controlled watershed algorithm and, finally, fused the multi-layered segments. Hu et al. [
52] improved this approach by using Gaussian analysis to determine whether a segment consists of multiple trees.
This paper attempts to address the limitations of working with eucalypt trees in a native Australian forest. Detection of dead standing eucalypt trees from full-waveform LiDAR without tree delineation has been proposed before and it was shown that it performs better than a random prediction [
53]. In image processing and computer vision, scientists try to identify if objects, like faces, exist within a 2D image and detect them by extracting features [
54]. Similarly, Miltiadou et al. [
53] extracted 3D structural features from local areas around dead trees (positive samples) and live trees (negative samples) using single-size 3D-windows and trained an object detection classifier. For tackling height and size variations of trees, this study adds to existing knowledge by introducing the usage of multi-scale 3D-windows. The classifier creates three probabilistic fields using three different sizes of 3D-windows and then merges the results, before proceeding to thresholding, filtering and assignment of predicted locations of dead standing trees. Using cross-validation, it was shown that, in comparison to the single-size 3D-windows approach, the multi-scale 3D-windows methodology improves prediction.
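The merging and thresholding of the three probabilistic fields can be illustrated with the sketch below. The merging rule is not specified in this section, so the sketch assumes a per-cell maximum; the function names, the cutoff and the probability values are likewise hypothetical, and the filtering step is omitted.

```python
def merge_fields(fields):
    """Merge equally sized 2D probability fields cell by cell.

    Assumption: the merged value is the maximum over all window sizes.
    """
    rows, cols = len(fields[0]), len(fields[0][0])
    return [
        [max(field[r][c] for field in fields) for c in range(cols)]
        for r in range(rows)
    ]


def threshold(field, cutoff):
    """Return (row, col) cells whose probability reaches the cutoff."""
    return [
        (r, c)
        for r, row in enumerate(field)
        for c, probability in enumerate(row)
        if probability >= cutoff
    ]


# Made-up fields for small, medium and large 3D-windows.
small = [[0.1, 0.8], [0.2, 0.3]]
medium = [[0.4, 0.5], [0.1, 0.2]]
large = [[0.2, 0.1], [0.9, 0.3]]

merged = merge_fields([small, medium, large])
candidates = threshold(merged, cutoff=0.7)
print(merged, candidates)
```

With a maximum rule, a dead tree that is only visible at one window size still produces a candidate location, which is one possible way of tackling height and size variations.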
4. Discussion
Dead wood is extremely important for managing biodiversity, since while decaying it provides resources for numerous organisms [
1]. Fungi play a substantial role in the formation of hollows and wood decay, which further supports biodiversity and, consequently, a resilient ecosystem [
59]. In Australia, tree hollows are formed by fungi and provide shelters to native arboreal mammals and birds [
2,
11]. They are, therefore, important for managing biodiversity. For detecting snags from infrared imagery, Polewski et al. [
60] used a two-stage detection approach. At first, Gaussian analysis was used to estimate locations of dead trees, and prior knowledge about the shapes and density of local areas manually labelled as snags was also used. Similarly, Miltiadou et al. [
53] used prior knowledge by extracting features from 3D-windows around dead and live trees, labelled in field data, and used these features to perform the detection.
The study presented in this article aims to increase resilience to tree heights and sizes, while working with native forests. It introduces the usage of multi-scale 3D-windows for detecting snags. For assessing the performance of the newly proposed algorithm, the multi-scale 3D-windows approach has been compared with a single-size 3D-windows approach and it was shown to improve detection of dead standing eucalypt trees without tree delineation. It was shown before that the single-size 3D-windows approach performs better than a random distribution of predicted dead tree locations with equal average density per plot [
53]. Additionally, since the multi-scale 3D-windows approach improves both recall and precision when compared with the single-size 3D-windows approach (
Figure 13), the improvement in detecting dead standing eucalypt trees is clear (precision and recall are explained in
Section 2.2.7).
Overall, the proposed methodology gives better results than previous work without tree delineation, but it needs further improvement. It is noticeable that the recall of Val4 is lower when the multi-scale 3D-windows approach is applied, but its precision increases by . This indicates potential over-prediction of dead trees; if too many dead trees are predicted, then precision is high since the algorithm manages to predict a high percentage of the actual dead trees, but recall is low because the probability of predicted dead trees being actual dead trees is low. This issue may be solved by adjusting the filters and thresholds.
Another limitation is the unknown noise within the field data and the limited number of trees measured in the field. The locations of the trees seem to have no consistency once plotted on the CHM (
Figure 2). This may occur because some of the dead trees are really small or sometimes only the trunk has remained, which is not acquirable by the LiDAR system. These data only produce noise and reduce the accuracy (precision and recall) of the classifier. This is a major limitation, since the classifier is trained using noisy field data, while at the same time there were occasions where fewer than 15 dead trees were used in the
k-NN for a specific window size. Increased training data and manual inspection of them should improve prediction.
Despite the aforementioned limitations, it is demonstrated that on average the proposed multi-scale 3D-windows approach improves the prediction. This opens up possibilities of further research in native forests that contain complex shapes. The effectiveness of the proposed methodology could be tested in various applications, like biomass estimation [
61], leaf area index [
62] and estimating total stem volume [
63]. The sizes of the windows will have to be adjusted to the new site but this can be done by observing the distribution of tree heights using histogram analysis.
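A histogram of tree heights for a new site can be produced with a few lines of Python. The sketch below only bins the heights; how window sizes are then chosen from the histogram is left to the analyst. The function name `height_histogram`, the bin width and the height values are assumptions for illustration.

```python
def height_histogram(heights, bin_width):
    """Count tree heights per bin; keys are the lower edges of the bins."""
    counts = {}
    for height in heights:
        lower_edge = int(height // bin_width) * bin_width
        counts[lower_edge] = counts.get(lower_edge, 0) + 1
    return dict(sorted(counts.items()))


# Made-up field-measured tree heights in metres.
example_heights = [8.2, 11.5, 14.9, 15.0, 22.3, 27.8, 29.1, 35.4]
histogram = height_histogram(example_heights, bin_width=10)
print(histogram)
```

Clusters in such a histogram could suggest how many tree size categories, and therefore how many window sizes, are appropriate for the site.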
In this study, the
k-NN algorithm was used because it does not fit the data into a single statistical model and it is, therefore, expected to give good results in detecting objects with variant shapes like snags. Nevertheless, Wu and Zhang [
64] have recently shown that Support Vector Machines (SVM) work better at classifying tree species than
k-NN and that this was due to noise. This is reasonable considering that
k-NN is very sensitive to noise. In the study presented here, the selected window sizes are smaller than the average height/size of each tree size category to reduce noise. Additionally, a random forest is first used to select the most important parameters for distinguishing dead from live trees. Only the most important parameters are used in the classifier for reducing dimensionality and, consequently, noise. Even though the selection of this approach is based on knowledge about the acquired data, in future work it is worth checking the performance of further state-of-the-art machine learning approaches. For example, SVM supports high dimensionality while classifying data and Zhao et al. [
65] showed that SVM performs better than the maximum likelihood classifier and linear regression models for estimating various forest parameters. Furthermore, with the increased computational capabilities of computers, neural networks are used in recent literature for advancing forest inventories [
66,
67]. When working with small classification datasets, there is a risk that neural networks may conclude that the relevant features within the training dataset are noise. Nevertheless, this possibility has decreased with the big data era and increased training samples.
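For readers unfamiliar with k-NN, the sketch below shows a plain majority-vote classifier in Python. It is a generic illustration, not the classifier configuration used in this study: the function name `knn_predict`, the choice of k = 3, the Euclidean distance and the two-feature training values are all assumptions.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training samples: (feature_1, feature_2) per 3D-window.
training = [
    ((4.1, 3.0), "dead"),
    ((3.8, 2.7), "dead"),
    ((4.5, 3.2), "dead"),
    ((1.2, 0.9), "live"),
    ((1.5, 1.1), "live"),
    ((0.9, 0.7), "live"),
]

print(knn_predict(training, (4.0, 2.9)))
print(knn_predict(training, (1.0, 1.0)))
```

Because every prediction depends directly on a handful of neighbouring samples, a few mislabelled or noisy training trees can change the vote, which is the sensitivity to noise discussed above.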
It is further worth highlighting the important structural parameters identified for distinguishing dead trees from live ones, which are (1) Height_Std: the standard deviation of the heights inside the 3D-window, (2) Top_Patch_Len_Std: the standard deviation of the length of all the top patches (a top patch is defined as the number of adjacent non-empty voxels, counting from the top of the column of interest) and (3) the standard deviation of the distances between the central voxel and every non-empty voxel lying inside the 3D-window. These parameters are reasonable considering that dead trees have fewer leaves in comparison to live trees and are, therefore, more likely to have bigger height differences within the 3D-windows. These parameters can be adopted in other sites and applications; for example, for the detection of infected trees and for distinguishing deciduous from evergreen trees during the leaf-off season due to their structural similarities.
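The three parameters can be approximated with the sketch below, which works on the set of non-empty voxels of one 3D-window. Several details are assumptions made for this example and may differ from the definitions in the methodology: the height of a column is taken as the level of its highest non-empty voxel, a top patch is counted downwards from that voxel, the population standard deviation is used, and the key `Dist_Std` is a hypothetical name for the third parameter.

```python
import math
from statistics import pstdev


def window_features(occupied, size):
    """Compute three structural features of one 3D-window.

    `occupied` is a set of (x, y, z) indices of non-empty voxels and
    `size` is the window size in voxels (nx, ny, nz).
    """
    nx, ny, nz = size
    heights, patch_lengths = [], []
    for x in range(nx):
        for y in range(ny):
            column = [z for z in range(nz) if (x, y, z) in occupied]
            if not column:
                continue  # empty columns contribute nothing
            top = max(column)
            heights.append(top + 1)
            # Count adjacent non-empty voxels downwards from the top one.
            length, z = 0, top
            while z >= 0 and (x, y, z) in occupied:
                length += 1
                z -= 1
            patch_lengths.append(length)
    centre = (nx // 2, ny // 2, nz // 2)
    distances = [math.dist(centre, voxel) for voxel in occupied]
    return {
        "Height_Std": pstdev(heights) if heights else 0.0,
        "Top_Patch_Len_Std": pstdev(patch_lengths) if patch_lengths else 0.0,
        "Dist_Std": pstdev(distances) if distances else 0.0,
    }


# Made-up window: one full column and one isolated voxel at the top.
example_voxels = {(0, 0, 0), (0, 0, 1), (0, 0, 2), (1, 1, 2)}
features = window_features(example_voxels, size=(3, 3, 3))
print(features)
```

In this toy window both columns reach the same level, so Height_Std is zero, while the different top patch lengths (3 and 1 voxels) give a Top_Patch_Len_Std of 1.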