1. Introduction
Knowledge of the composition and structure of forest ecosystems is critical for ecological understanding and for developing effective forest management strategies. The increased availability of geographic information and remote sensing data for forested ecosystems, in combination with advances in computational modelling, has facilitated the development of predictive ecosystem mapping (PEM) models [1,2,3]. PEM models are defined as methods that identify ecological-landscape relationships from spatial environmental data and field observations (as available) to predict vegetation composition across a landscape [1,2,4,5,6,7]. Using similar approaches but with varying objectives and units, PEM models have been broadly defined as predictive vegetation mapping [1], which also encompasses species distribution models (SDMs) [2,3,8,9], bioclimatic envelope models [10,11], habitat suitability or decision-support models [12,13,14], and ecological niche models [15,16]. Such models originated from ecological niche theory, whereby vegetation distribution is predicted using variables that either correlate with or define the tolerance ranges of species [1,8,9,10,17,18].
There are three typical approaches to modelling community distributions from species and environmental data: (1) assemble species into communities and then predict, (2) predict species individually and then assemble them into communities, and (3) assemble and predict species together [19,20,21,22,23]. The first approach uses some form of classification, ordination, or aggregation method to generate communities from individual species survey data. The presence-absence locations of each community are then compared with environmental predictors to describe their distributions. The second approach models each species distribution individually as a function of available environmental predictors. A community-level output is then generated from a classification, ordination, or aggregation of all the extrapolated individual species distribution models. The third approach characterizes each community and its associated relationships with environmental data in a single process that incorporates each species and all environmental data simultaneously. No single approach will be optimal in all circumstances; the selection depends on available data and study objectives [21,22]. Ohmann et al. [24] argued that any approach selected should aim to retain information on individual species and patterns of co-occurrence in the final predictions. Joint species distribution modeling (JSDM) is an approach that accounts for both presence-absence and abundance data jointly across species to classify communities [25,26]. However, combining independent SDM outputs may be just as effective as JSDM where the species of interest do not interact with one another [25]. These techniques all aim to develop predictive models that use multidimensional descriptive variables to predict the distribution of vegetation across landscapes.
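The second approach (predict species individually, then assemble) can be illustrated with a minimal sketch. It assumes that per-species occurrence probabilities have already been predicted at each site, and it assembles communities by a simple aggregation rule: the community whose indicator species have the highest mean predicted probability wins, provided that mean reaches a cutoff. The species names, community names, aggregation rule and the `min_score` cutoff are all hypothetical and are not taken from the studies cited above.

```python
from statistics import mean


def assemble_communities(species_probs, indicator_species, min_score=0.5):
    """Assign each site to a community from individual species predictions.

    species_probs: dict mapping species name -> list of predicted
        occurrence probabilities, one per site (hypothetical SDM outputs).
    indicator_species: dict mapping community name -> list of species that
        characterize it (hypothetical groupings).
    min_score: hypothetical cutoff below which a site stays unclassified.
    """
    n_sites = len(next(iter(species_probs.values())))
    labels = []
    for site in range(n_sites):
        # Aggregate the species-level predictions into one score per community.
        scores = {
            community: mean(species_probs[sp][site] for sp in members)
            for community, members in indicator_species.items()
        }
        best = max(scores, key=scores.get)
        labels.append(best if scores[best] >= min_score else "unclassified")
    return labels


# Hypothetical predictions for three species at three sites.
species_probs = {
    "species_a": [0.9, 0.2, 0.3],
    "species_b": [0.8, 0.1, 0.2],
    "species_c": [0.1, 0.7, 0.4],
}
indicator_species = {
    "community_1": ["species_a", "species_b"],
    "community_2": ["species_c"],
}
labels = assemble_communities(species_probs, indicator_species)
print(labels)
```

Other assembly rules (classification or ordination of the stacked species predictions) would fit the same description; the mean-probability rule is used here only because it is short.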
The use of multiple data types to describe environmental variables and forest structure can result in complex, often non-linear data, which challenges the accuracy of predictive models developed using traditional parametric approaches [27,28,29]. Machine learning techniques have emerged from synergies between computer science and the identification of ecological processes and patterns, known as ecological informatics, and these techniques directly address issues of data complexity and non-linearity to generate accurate predictive models [28,29,30]. Examples of machine learning techniques that have been used for species distribution modelling include artificial neural networks, k-nearest neighbors, support vector machines, boosted regression trees, and random forests [6,8,27,31,32,33,34]. The random forest algorithm, developed by Breiman [35], has two key advantages for forest distribution mapping: (1) the novel variable importance measure, and (2) the proximity measures of similarity among data points [27].
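The proximity measure mentioned above can be sketched in a few lines. In a random forest, the proximity of two samples is the fraction of trees in which both samples fall in the same terminal node. The sketch below assumes the terminal-node indices are already available as a table (`leaf_ids`, a hypothetical input; in scikit-learn such a table could be obtained from a fitted forest's `apply` method), and the toy values are invented for illustration.

```python
def proximity_matrix(leaf_ids):
    """Random forest proximities from terminal-node assignments.

    leaf_ids[i][t] is the index of the terminal node that sample i reaches
    in tree t. The proximity of samples i and j is the fraction of trees
    in which they share a terminal node.
    """
    n_samples = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    prox = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(i, n_samples):
            shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


# Toy example: three samples passed through four trees (values are invented).
leaf_ids = [
    [1, 4, 2, 7],
    [1, 4, 3, 7],
    [2, 5, 3, 8],
]
prox = proximity_matrix(leaf_ids)
print(prox[0][1], prox[0][2], prox[1][2])  # 0.75 0.0 0.25
```

Samples with high proximity are treated as similar by the forest, which is what makes the measure useful for exploring similarity among plots.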
Many predictor variables for PEM models can be measured in the field or collated from various geographic information system layers and remote sensing data [3]. Selection of predictor variables is critical to PEM accuracy [36,37,38,39], with each variable considered in terms of its importance to the distribution of an ecosystem and the ecological basis for its inclusion [18,37]. These variables can include a number of environmental gradients: direct gradients influence growth but are not consumed (e.g., temperature, pH); indirect gradients have a location-dependent correlation with forest distribution but have no functional effect on plant growth (e.g., latitude, altitude, slope); and resource gradients are consumed by vegetation (e.g., light, water) [2,5,18,36]. Interactions between structural and compositional gradients can influence stand development and are important to capture in ecosystem classification [40]. These gradients of structure and composition describe the state of vegetation at a specific point in time (e.g., height, cover, strata), and can be broadly defined as ‘characteristic gradients’. Characteristic gradients are increasingly being included in PEM models, with inputs summarizing forest structural complexity into a single metric, or into multiple metrics when three-dimensional data are available [12,41,42].
Characteristic gradients can be quantified at broad spatial scales using remote sensing, in contrast to spatially limited and cost-intensive field observations. Systematic field observations are often individual locations spread across large areas with limited sensitivity to the vertical assemblage of vegetation. Both horizontal and vertical structure can be obtained across a spatial continuum using light detection and ranging (lidar) technology. Lidar is an active remote sensing technique that uses laser ranging to determine the distance between the sensor and an object. Distance is calculated from half the time elapsed between the emission of a laser pulse and the detection of the returned pulse after it strikes an object, multiplied by the speed of light [14,43,44]. Key sensitivity and accuracy limitations of commonly used optical and radar imagery [45] can be overcome using lidar data [44]. Lidar pulses measure the presence or absence of foliage and topographic structure in three dimensions [46,47,48,49], whereas optical and radar data responses are generally canopy and mid-canopy driven, respectively [43]. Results from Kane et al. [50] highlight that lidar can accurately characterize forest successional stages in the absence of field measurements. The ability of lidar to exploit small gaps in the forest canopy and capture understorey structure allows for the inclusion of lower strata attributes in predictive vegetation models [51,52,53,54]. Predictive models that include characteristic variables from lidar often use discrete derivatives such as canopy height, foliage cover, leaf area index (LAI), and vegetation density [12,29,50,51,55,56,57,58,59,60]. Recent studies using the vertical continuum of lidar data have typically focused on structure-only classifications using random forest [49,61], excluding the ecological data on species composition that dominate landscape-scale PEM.
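The laser ranging principle described in this paragraph reduces to a one-line calculation: the pulse travels to the object and back, so the one-way distance is the speed of light multiplied by half the round-trip time. The sketch below is illustrative only. It uses the vacuum speed of light (the slightly slower speed in air is ignored), and the second function assumes a vertically directed pulse so that the difference between two ranges can be read as a height; the function names and example timings are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value; air is marginally slower


def range_from_round_trip(round_trip_seconds: float) -> float:
    """One-way distance (m) from the time between pulse emission and return."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse covers the sensor-to-object distance twice, hence the halving.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


def height_above_ground(canopy_return_s: float, ground_return_s: float) -> float:
    """Height (m) of a canopy return above the ground return of the same pulse.

    Assumes a vertically directed pulse (an illustrative simplification).
    """
    return range_from_round_trip(ground_return_s) - range_from_round_trip(canopy_return_s)


# Hypothetical timings: a canopy return at 5.8 microseconds, ground at 6.0.
canopy_height_m = height_above_ground(5.8e-6, 6.0e-6)
print(round(canopy_height_m, 2))
```

Because a single pulse can produce several returns, repeating this calculation for each return is what yields the vertical distribution of vegetation discussed above.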
In south-eastern Australia, eucalypt forests have traditionally been delineated based on dominant Eucalyptus species without consideration for associated Acacia, rainforest and understorey species [62]. The classification of rainforest focuses on the floristic composition at maturity and excludes changes in species composition over time that may be driven by disturbance (e.g., ingress of Acacia and Eucalyptus following fire) [62]. Cameron [62] argued that, while this traditional approach has advantages for interpreting community niche distributions, it is problematic at ecotones that share dominant species, or where dominant canopy species are replaced by species characteristic of another stand type (e.g., patches of eucalypts within a rainforest mosaic after fire). The potential need to consider both climatic and structural gradients in PEM models for this region was further highlighted in a recent study indicating that both types of variables were important for explaining beta-diversity in the temperate forests of south-eastern Australia [63]. Accordingly, the cool temperate forest landscapes of the Central Highlands region in south-eastern Australia are an ideal case study for examining the potential to improve traditional PEM approaches using structural information such as characteristic gradients. The landscape contains wet sclerophyll forest, rainforest and an ecotone between these two stand types described as ‘cool temperate mixed forest’ [62], which has so far been excluded from the principal vegetation class system in the region (‘Ecological Vegetation Class’, EVC) [64,65].
This study seeks to examine the utility of combining recent developments in stand type predictions, using characteristic gradients in lidar structural attributes, with complementary sources of ecologically meaningful information. The previously developed lidar-derived stand type map [61] demonstrates the assemble-and-predict approach, which evaluates characteristic gradients (i.e., the vertical profile of vegetation density) across the landscape to predict the distribution of rainforest, ecotone and eucalypt stands. The distributions of the species that best characterize stand types in the region were predicted individually and subsequently assembled into community groups to provide important ecological context. The lidar-derived and SDM-derived stand type classifications, together with auxiliary criteria that further stratify these stand types (i.e., specific individual species, elevation, canopy height), were combined using a series of decision trees to generate a highly detailed PEM model of the study region. This demonstrates a hierarchical, stepwise approach to combining lidar-derived stand type classifications with auxiliary information that provides important ecological context for predicting the spatial distribution of ecological communities. The PEM model demonstrates the potential for defining the spatial distribution of ecological communities at a much higher resolution than existing EVC-based classifications in the study region.
4. Discussion
The fusion of statistical analyses, lidar-derived structural profiles, and an assemblage of SDMs demonstrates an integrated approach to PEM that was highly sensitive to spatial variations in vegetation structure. The field observations are complemented by the continuous coverage of remote sensing information that indirectly quantifies characteristics of stand function [3]. The potential importance of capturing this spatial variability is highlighted when comparing the PEM model against the EVCs that are commonly used in the region. The PEM model shows a much greater level of detail than the EVCs, which are often spatially homogenized with large clusters of uniform stand types. The little detailed information that has been published about the development of EVCs suggests that elements of structure were included in the classification process [65]. Structure (spatial arrangement of forest components) variables in modelling are often used interchangeably with attributes describing function (forest processes) and composition (species abundance and diversity), where details related to one can be used as a proxy for another [41]. The extent, level of detail, and number of variables by which forest structure can be defined by lidar differ greatly in magnitude from those of in situ measurements and those derived from optical remote sensing data such as Landsat [54].
The high spatial variability of the PEM model suggests that many stand types do not have as continuous a cover as suggested by the EVC distributions, and this is consistent with our extensive field observations of these landscapes. Despite these differences in spatial resolution, there was generally a good level of agreement in the extent and locations of similar stand types predicted by the PEM model and the EVC maps. Our analysis highlights how complex structural and ecological information can complement one another through different modeling pathways. The lidar-derived landscape structure model was effective at constraining the much broader predictions of rainforest and eucalypt stands by the landscape species model. Conversely, the landscape species model was crucial in providing an ecological basis for constraining predictions of ecotone as indicated by the landscape structure model. Leveraging the congruence between predictions driven by characteristic and environmental gradients shows promise for mapping ecotone stands, as they are likely to have the most dynamic distributions.
Lidar-derived characteristic gradients provide a continuous representation of the full vertical vegetation profile, which is useful for identifying the structural gradient between rainforest, ecotone, and eucalypt stands. Lidar can resolve understorey structure with a level of detail that would not otherwise be possible using characteristic gradients derived from optical remote sensing [3,29,54]. This vertical complexity can be measured at very broad spatial scales using lidar data, and repeat acquisitions can provide valuable opportunities to refine model performance and identify important structural changes in vegetation between specific points in time across the landscape. While field observations were limited to undisturbed stands, the output PEM model may be subject to error in the landscape structure model due to the time difference between field observations and lidar capture. Future studies would benefit from coincident field observation and lidar capture. Furthermore, airborne lidar provides a consistent measurement of structure at a scale that is not possible from the ground, including in regions with limited ground access. Nonetheless, one of the main limitations of using purely structural information to predict stand types is that the tolerance of the individual species that comprise these ecosystems to environmental conditions is not considered [49,61]. For example, eucalypt forest and montane eucalypt forest share similar structural characteristics but are strongly delineated by the environmental conditions that are suitable for either E. regnans or E. delegatensis. By incorporating SDMs into the PEM model, an ecological basis for stand type classification can be determined independently of isolated lidar-derived classifications [96].
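The eucalypt example above can be expressed as a single decision step in which a structural class is refined using species-level predictions. The sketch below is an illustration under stated assumptions and is not the decision tree used in this study: the class labels, the function name, and the rule of comparing the two predicted suitabilities directly are all hypothetical, and the sketch assumes that E. delegatensis is the species associated with the montane stand type.

```python
def refine_eucalypt_stand(structure_class, p_regnans, p_delegatensis):
    """Refine a lidar-derived structural class using SDM outputs.

    structure_class: label from a structure-only model, assumed here to be
        one of "rainforest", "ecotone" or "eucalypt" (hypothetical labels).
    p_regnans, p_delegatensis: predicted suitability (0 to 1) for
        E. regnans and E. delegatensis at the same location.
    """
    if structure_class != "eucalypt":
        # Structure alone is taken as sufficient for the other classes here.
        return structure_class
    # Structure cannot separate the two eucalypt types, so the species
    # predictions decide (hypothetical rule: the more suitable species wins).
    if p_delegatensis > p_regnans:
        return "montane eucalypt forest"
    return "eucalypt forest"


print(refine_eucalypt_stand("eucalypt", p_regnans=0.2, p_delegatensis=0.7))
print(refine_eucalypt_stand("eucalypt", p_regnans=0.8, p_delegatensis=0.3))
print(refine_eucalypt_stand("ecotone", p_regnans=0.8, p_delegatensis=0.3))
```

Chaining several steps of this kind, each drawing on a different auxiliary criterion, gives the hierarchical, stepwise structure described for the PEM model.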
Lidar-derived structural profiles were intentionally excluded as predictors for SDMs, as the PAVD profiles represent assemblages of vegetation and their inclusion would have resulted in a large increase in model complexity [49,61]. The use of lidar in SDMs has typically included several lidar-derived metrics [12,56,57,59] rather than a derivative that leverages the entire stream of data collected along the vertical profile. Zimble et al. [12] used airborne lidar-derived tree heights to classify western United States forests into single- and multi-layered stands with >90% accuracy, for use as the structural component of a variation of the United States Forest Service PEM called habitat decision-support systems. Recent studies that have utilized these profiles have reported similar challenges in distinguishing stand types with similar vertical vegetation structure [49]. In this study, maximum canopy height was not a useful lidar metric for distinguishing stand types because E. regnans occurs across most of the landscape and grows to consistent maximum heights.
Recent studies [this study, 29,51,55] have identified how remotely sensed gradients constrain the compositional space of stand or community types based on observed changes in forest structure at different successional stages. Hakkenberg et al. [29] explore an alternative approach to modeling community continua by incorporating lidar and hyperspectral remote sensing. Their study uses compositional ordination to summarize plot species information into axes of maximum floristic variation (similar to Simonson et al. [51]). Hakkenberg et al. [29] then classified communities using an unsupervised classification approach based on goodness-of-clustering evaluators and dissimilarity matrices. This approach differs from ours but achieves similar outcomes to the SIMPER analyses we used by identifying species that characterize forest communities based on underlying matrices of floristic dissimilarity. Dissimilarity matrices (e.g., Ferrier et al. [100]) based on both species information [this study, 29] and lidar-only metrics [55] are increasingly being used to classify forest stand types. Compositional modelling in this study, and in both the Hakkenberg et al. [29] and Moran et al. [55] studies, used random forest for classification because of its generalizability and maximization of predictive accuracy, together with its ability to balance limited training data and high data dimensionality. Key differences in this study include the use of stand type classifications derived from the vertical continuum of lidar data through dimension-reduced principal components as predictor variables, and the removal of highly correlated predictor variables (i.e., the landscape structure model). The ordination approaches detailed in Simonson et al. [51] and Hakkenberg et al. [29] are ideally suited to landscapes with limited a priori knowledge about community or stand species composition, while the approaches detailed in Moran et al. [55] and Fedrigo et al. [61] are suitable when considering classification using lidar-only metrics and vertical continuum data, respectively. This study uses a priori knowledge of stand type composition, combined with the SIMPER approach to species classification at the landscape scale prior to the use of lidar data, for further stand type identification. Other studies that have performed similar classifications were in less complex forests in North America [29] and Europe [51], while our study demonstrates the success of similar PEM approaches in the structurally complex (multi-layered, dense, closed-canopy temperate) forests of south-eastern Australia. All recent PEM studies highlight concerns with mapping stand or community types as discrete units when their distributions across the landscape are ultimately continuous and can often overlap in areas of transition [29,55,61].
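As a concrete illustration of the floristic dissimilarity matrices discussed above, the sketch below computes pairwise dissimilarities between plots from species abundance vectors. It assumes the Bray-Curtis measure, on which SIMPER analyses are conventionally based; this section does not state which measure was used in the studies discussed, so that choice, the function names and the plot data are all assumptions made for illustration.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (0 to 1)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        return 0.0  # two empty plots are treated as identical (a convention)
    return numerator / denominator


def dissimilarity_matrix(plots):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(plots)
    return [[bray_curtis(plots[i], plots[j]) for j in range(n)] for i in range(n)]


# Hypothetical abundances of three species in three plots.
plots = [
    [10, 0, 5],
    [5, 5, 5],
    [0, 8, 0],
]
matrix = dissimilarity_matrix(plots)
for row in matrix:
    print([round(value, 3) for value in row])
```

A value of 0 indicates identical composition and a value of 1 indicates no shared species, so intermediate values along a transect are one way to see the continuous transitions between stand types noted above.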
The PEM model developed as part of this research provides the first spatial prediction of dominant stand types in the study region that considers both the ecological niche and the vertical continuum of vegetation structure. Lidar can penetrate forest canopies and resolve understorey structure with a level of detail that would not otherwise be possible using characteristic gradients derived from optical remote sensing. Lidar-derived characteristic gradients are therefore particularly useful for identifying ecotone stands that are characterized by a eucalypt overstorey with a rainforest-associated understorey. Cameron [62] defined these ecotone stands as ‘cool temperate mixed forest’ but until now they have not been formally delineated. The ecotone may be the stand type most likely to change in distribution due to the regeneration dynamics that occur in these stands with changes in microclimate, disturbance regimes and light conditions [62,63]. The succession of forest stands from one type to another suggests that these stand type maps reflect dynamic processes and will require regular assessment to evaluate their agreement with ground observations [50,58]. Repeat lidar acquisitions over time will provide opportunities to detect changes in the structure and distribution of these stand types in response to natural and anthropogenic disturbances at very high spatial resolutions.