Regional Variation in Forest Canopy Height and Implications for Koala (Phascolarctos cinereus) Habitat Mapping and Forest Management

Previous research has shown that the Koala (Phascolarctos cinereus) prefers larger trees, potentially making this a key factor influencing koala habitat quality. Generally, tree height is considered at regional scales which may overlook variation at patch or local scales. In this study, we aimed to derive a set of parameters to assist in classifying koala habitat in terms of tree height, which can then be used as an overlay for existing habitat maps. To determine canopy height variation within a specific forest community across a broad area in eastern Australia, we used freely available Airborne Laser Scanning (ALS) data and adopted a straightforward approach by extracting maximum-height ALS returns within a total of 288 30 m × 30 m “virtual” ALS plots. Our findings show that while maximum tree heights generally fall within published regional-scale parameters (mean height 33.2 m), they vary significantly between subregions (mean height 28.8–39.0 m), within subregions (e.g., mean height 21.3–29.4 m), and at local scales, the tree heights vary in response to previous land-use (mean height 28.0–34.2 m). A canopy height dataset useful for habitat management needs to recognise and incorporate these variations. To examine how this information might be synthesised into a usable map, we used a wall-to-wall canopy height map derived from ALS to investigate spatial and nonspatial clustering techniques that capture canopy height variability at both intra-subregional (100s of hectares) and local (60 hectare) scales. We found that nonspatial K-medians clustering with three or four height classes is suited to intra-subregional extents because it allows for simultaneous assessment and comparison of multiple forest community polygons. Spatially constrained clustering algorithms are suited to individual polygons, and we recommend the use of the Redcap algorithm because it delineates contiguous height classes recognisable on a map. For habitat management, an overlay combining these height classification approaches as separate attributes would provide the greatest utility at a range of scales. In addition to koala habitat management, canopy height maps could also assist in managing other fauna; identifying forest disturbance, regenerating forest, and old-growth forest; and identifying errors in existing forest maps.


Koala Habitat
Wildlife managers are required to balance competing demands, firstly, to provide solutions to ensure the sustainability of wildlife populations, and secondly, to mitigate potential conflicts arising from human resource needs. Resolution of these conflicts requires adequate knowledge about how wildlife uses habitats, this information is collected using, for example, presence/absence surveys, dietary studies, and telemetry. In the case of the Koala (Phascolarctos cinereus, henceforth koala), use of habitat is primarily influenced by the density of preferred food trees, e.g., [1]. This knowledge is transferred to vegetation maps to generate a habitat map which ranks vegetation communities in terms of their relative importance to the koala [2]. These maps are derived from regional-scale vegetation maps covering~100,000 hectares and might not capture habitat variability at site or landscape scales relevant to the utility needs of both the koala and habitat management stakeholders. We refer to "stakeholders" as a wide group including government agencies, developers, individuals, and community groups; all of whom have an interest in making decisions about koala habitat.
Larger trees have been identified as a factor influencing koala habitat quality by many studies, using methods ranging from; correlations between koala use and tree size at the plot scale; to mixed multi-scale models with additional environmental factors (Table 1). Mixed models have provided varying results, but no studies published have shown that the koala prefers smaller trees. Table 1. Previous studies investigating use of larger trees by the koala. DBH = diameter at breast height.

Method Scale Finding Reference
Use vs. Availability by DBH class plot positive correlation with food tree, negative with non-food trees [3] tree girth and koala use koala radio tracking positive correlation with food tree size [4] generalised linear mixed model koala radio tracking taller non-eucalypts on hot days [5] mixed-effects logistic model in 3 regions with 9 habitat variables tree tree size importance varies regionally. Noosa: Ranked 1 of 9. Port Stephens: 7 of 9. Ballarat: 6 of 9 [6] koalas observed in trees, generalised additive models, generalised additive mixed models site (7.6 ha) increasing DBH highly positively significant [7] generalised linear mixed effects model, 3 seasonal surveys transect positively associated with DBH, negative with tree height [8] Use vs. Availability by DBH class transect positively associated with large trees [9] 2-way ANOVA koala radio tracking positively associated with larger trees within plots [10] DBH, t-test plot positively associated with larger trees within plots [11] crown size, t-test plot positively associated with crown size within plots [11] Logistic regression and hierarchical partitioning of 12 variables plot larger DBH had strongest independent effect, but negative [12] 9 variables, mixed-effects resource selection probability functions plot/landscape tree species and tree height most important factors at both scales [13] Observed and expected tree visitation based on DBH tree koalas use trees significantly larger than expected [14] one-way ANOVA koala radio tracking large trees selected disproportionately to availability [15] The preference for larger trees has been attributed to several factors; (a) there may be a threshold tree size that guides the selection of foraging behaviour [4]; (b) larger trees have larger canopies providing increased food resources [7,9] and shelter [5,13]; and (c) tree health and vigour, as determined by environmental gradients [13], in more challenging environments trees are likely to be smaller, with less nutritional quality and greater levels of deterrent foliar chemicals [16], and site competition also affects tree size [17]. In addition, other factors have been recognised, for example, soil fertility influences habitat occupancy at the landscape scale [3,18] and surrogate measures of fertility (slope and elevation) have been incorporated into some koala habitat models (e.g., [19]). A study by Callaghan et al. [3] suggested that, where detailed structural mapping is available, 'an overlay map could indicate areas where koala habitat quality is enhanced by the predominance of large, older-growth trees.' Identifying these areas might therefore provide useful information for habitat management at a range of scales. species and are not readily applicable to the more structurally-heterogenous Eucalyptus species forests found in Australia. These forests typically have sparse, highly clumped canopies and erectophile leaf angle distributions, allowing ALS pulses to penetrate deeper into the canopy; therefore, they may underestimate dominant or maximum height [42]. Conversely, denser canopies in some tree species affect the accuracy of the underlying terrain model from which canopy heights are obtained [43]. Considering these two factors, point densities above 0.5 pts/m 2 provide forest metrics, including height, that did not differ significantly from those obtained with a density of 10 pts/m 2 [43]. ALS metadata typically specifies a target pulse density of 1 pt/m 2 . At broader scales, ALS has been used to calibrate space-borne LiDAR data to produce vegetation community structural classifications in Queensland [36], and across Australia [44].
Koalas prefer vegetation communities with a high proportion of food trees [2,3], and they also prefer habitats with larger trees (Table 1). However, no previous research has attempted to combine these factors in a habitat management map. Given extensive ALS coverage is available in the southeast Queensland and the New South Wales north coast regions where koalas are widely distributed, our aim is to evaluate the utility of ALS in developing high resolution canopy height models which can complement existing low resolution regional-scale habitat maps and further assist wildlife managers to better manage koalas. To do this, we ask the following research questions: (1) do published herbarium descriptions adequately capture height variability at regional, sub-regional, and intrasubregional scales, and; (2) how can canopy height variability best be incorporated into a koala habitat map that can be useful to wildlife managers across different spatial scales.

Materials and Methods
Using herbarium descriptions as a baseline, we first examine height variability at different scales, and then, how this information could be incorporated into existing koala habitat management maps to provide greater utility at a range of scales. We make the following hierarchical scale definitions: • Regional: The whole study region encompassing southeast Queensland and the adjoining Tweed Local Government Area in New South Wales (2,500,000 hectares), • Subregional Group: Any identified area which is geographically separate from other areas within the study region (30,000-230,000 ha), • Intra-subregional Group: Any identified area which is geographically separate from other areas within a subregional group (400-1600 ha), or our subregional study area, • Polygon Group: For map production, we used contiguous groups of 25 m × 25 m cells derived from a vector polygon within intra-subregional groups (<100 ha).

Variation in Forest Canopy Height
We chose a floristically variable vegetation community with tree species known to be used by the koala [2,45] within the Southeast Queensland (SEQ) Interim Bioregion [46], which covers SEQ and part of the New South Wales north coast. In SEQ, the community is mapped as Regional Ecosystem (RE) 12.11.3 [47]. Its floristic composition is described by the Regional Ecosystem Descriptions Database [48] as mainly composed of Eucalyptus siderophloia (Grey Ironbark) and E. propinqua (Small-fruited Grey Gum), with other patchily distributed species including E. microcorys (Tallowwood), Lophostemon confertus (Brush Box), Corymbia intermedia (Pink Bloodwood), E. biturbinata (a Grey Gum), E. acmenoides (White Mahogany), and E. tereticornis (Queensland Blue Gum). Vegetation communities similar to RE 12.11.3 extend into New South Wales [49,50]. RE 12.11.3 forest structure is described in a Technical Description [51], and varies from Tall Open Forest to Woodland, with overstorey heights ranging between 15-37 m (average 28.1 m). The Description uses 66 Herbarium CORVEG plots, data collected at these plots includes floristic composition and crown cover from all forest structural layers at relatively undisturbed and representative plots, CORVEG data was collected between 1991-2011, 30 of these plots were collected within 10 years of ALS acquisition which began in 2009. We aimed to use all 38 plots within the ALS footprint but discarded five sites which had been disturbed by development post-CORVEG data collection. RE 12.11.3 in southeast Queensland and the associated NSW community covers 76,390 ha. Our study was limited to those areas containing ALS data (49,680 ha) ( Figure 1).
We developed a workflow ( Figure 2) providing a multiscale approach to assess the forest height variability within the vegetation community (henceforth referred to as RE 12.11.3) at the regional, subregional, and intra-subregional scales.
Forests 2021, 12, x FOR PEER REVIEW 5 of 45 1991-2011, 30 of these plots were collected within 10 years of ALS acquisition which began in 2009. We aimed to use all 38 plots within the ALS footprint but discarded five sites which had been disturbed by development post-CORVEG data collection. RE 12.11.3 in southeast Queensland and the associated NSW community covers 76,390 ha. Our study was limited to those areas containing ALS data (49,680 ha) ( Figure 1).
We developed a workflow ( Figure 2) providing a multiscale approach to assess the forest height variability within the vegetation community (henceforth referred to as RE 12.11.3) at the regional, subregional, and intra-subregional scales. We examined a forest community used by koalas (Regional Ecosystem 12.11.3) occurring across Southeast Queensland and the New South Wales north coast (Tweed). Within this study area, six distinct subregions (dotted black ellipses) of RE 12.11.3 can be recognised by spatial proximity or separation by rivers or state borders. We selected the Koala Coast to examine the intra-subregion canopy height variability ( Figure 3) and an additional site on the Sunshine Coast to examine the height variability resulting from historical forest disturbances (blue dot; see Figure 4). ALS coverage is shown (pale yellow background); vegetation community RE 12.11.3 is covered by ALS (dark green, with similar areas in NSW), and RE 12.11.3 not covered by ALS (light green). Random plots (205) within ALS-covered RE 12.11.3 (white circles) and CORVEG plots (33) within ALS-covered RE 12.11.3 (black circles) are shown. We examined a forest community used by koalas (Regional Ecosystem 12.11.3) occurring across Southeast Queensland and the New South Wales north coast (Tweed). Within this study area, six distinct subregions (dotted black ellipses) of RE 12.11.3 can be recognised by spatial proximity or separation by rivers or state borders. We selected the Koala Coast to examine the intra-subregion canopy height variability ( Figure 3) and an additional site on the Sunshine Coast to examine the height variability resulting from historical forest disturbances (blue dot; see Figure 4). ALS coverage is shown (pale yellow background); vegetation community RE 12.11.3 is covered by ALS (dark green, with similar areas in NSW), and RE 12.11.3 not covered by ALS (light green). Random plots (205) within ALS-covered RE 12.11.3 (white circles) and CORVEG plots (33) within ALS-covered RE 12.11.3 (black circles) are shown. . Workflow implemented to examine the forest height variability at the regional, subregional, and intra-subregional scales for the nominated forest community (Regional Ecosystem 12.11.3). We assessed Herbarium field plots (CORVEG) to examine how well the published information captures the height variabilities within RE 12.11.3 at these scales.

Figure 2.
Workflow implemented to examine the forest height variability at the regional, subregional, and intra-subregional scales for the nominated forest community (Regional Ecosystem 12.11.3). We assessed Herbarium field plots (CORVEG) to examine how well the published information captures the height variabilities within RE 12.11.3 at these scales. Within the study area, we nominated six sub-regions ( Figure 1). The Sunshine Coast and Western subregions have 38% and 29% ALS coverage, respectively, but the other subregions are entirely covered. Within the region covered by ALS, in addition to the 33 CORVEG plots, we used ArcMap [52] to generate 205 random 30 m × 30 m "virtual" plots within RE 12.11.3, which were at least 100 m apart.
The technical description for RE 12.11.3 provides the average canopy height for the RE by estimating the heights of at least three trees representing the median canopy height in each of the 66 [53] representative 20 m × 50 m plots [54]. Owing to the absence of individual plot orientation, we could not replicate this procedure, so we used 30 m × 30 m plots centred on the CORVEG plots and extracted the maximum canopy height in each of these plots using ALS.

Subregional-Scale
Within the wider region, we nominated six spatially separate subregional groups ( Figure 1) and ensured that each subregion had at least 30 of the 205 virtual plots to enable an assessment of the cross-region height variation and compare it against the CORVEG plot heights.

Intra-Subregional-Scale
To examine the variability at finer scales, we selected the geographically distinct Koala Coast subregion (Figure 1), which also has the highest density of koalas in the wider region. Within the Koala Coast, we nominated seven distinct intra-subregional groups based on spatial separation, with two close groups demarcated by differences in elevation ( Figure 3). Within the study area, we nominated six sub-regions ( Figure 1). The Sunshine Coast and Western subregions have 38% and 29% ALS coverage, respectively, but the other subregions are entirely covered. Within the region covered by ALS, in addition to the 33 CORVEG plots, we used ArcMap [52] to generate 205 random 30 m × 30 m "virtual" plots within RE 12.11.3, which were at least 100 m apart.
The technical description for RE 12.11.3 provides the average canopy height for the RE by estimating the heights of at least three trees representing the median canopy height in each of the 66 [53] representative 20 m × 50 m plots [54]. Owing to the absence of individual plot orientation, we could not replicate this procedure, so we used 30 m × 30 m plots centred on the CORVEG plots and extracted the maximum canopy height in each of these plots using ALS.

Subregional-Scale
Within the wider region, we nominated six spatially separate subregional groups ( Figure 1) and ensured that each subregion had at least 30 of the 205 virtual plots to enable an assessment of the cross-region height variation and compare it against the CORVEG plot heights.

Intra-Subregional-Scale
To examine the variability at finer scales, we selected the geographically distinct Koala Coast subregion (Figure 1), which also has the highest density of koalas in the wider region. Within the Koala Coast, we nominated seven distinct intra-subregional groups based on spatial separation, with two close groups demarcated by differences in elevation ( Figure 3).  For the Koala Coast intra-subregional groups, we generated a CHM covering all RE 12.11.3 polygons and populated an aligned rectangular 25 m × 25 m polygon grid with CHM height values.
To further investigate the intra-subregional variation, and to examine the influence of disturbance on the forest height, we used data from a 60-hectare study site in the Sunshine Coast subregion. This focus study site encompassed two tenures with different land use histories-a private property "Quinlans" and Mapleton National Park. Quinlans was a dairy farm up until the 1960s, with the vegetation mostly cleared; the subsequent regrowth vegetation is now classified as Remnant Vegetation [47]. Mapleton NP was declared a forest reserve in 1923, and its vegetation is also mapped as Remnant. Figure 4 shows the vegetation cover in 1940; in some areas of Quinlans, the vegetation was almost completely removed, and the remaining forest areas were extensively thinned to an open woodland. A 1958 aerial photo (not shown) indicates that tree regrowth was actively occurring in Quinlans at that time. Areas of possible disturbance in Mapleton NP 1940 are also shown and might be because of previous logging, fire, and/or storm damage or topographic influences (dryer northerly aspect).
Within this site, we used 50 30 m × 30 m virtual plots spaced on a 100-m map grid (Map Grid of Australia MGA94 Zone 56). In May 2019, we collected data, including tree species, diameter at breast height (DBH), and MGA coordinates, for every tree over 10-cm DBH at 44 sites. Data collection was aimed at obtaining detailed knowledge of both tree species variability and structural variability across the site and to provide a context to assist the interpretation of the ALS data.

Height Variability-Extracting Maximum Canopy Height from ALS
We obtained all ALS data from ELVIS (Elevation Information System) [28] at https: //elevation.fsdf.org.au/, accessed on 24 June 2021. This site collects elevation data from many Australian government agencies and includes classified ALS point clouds. Our regional study area encompassed multiple ALS acquisition projects ( Table 2); all available metadata showed that the minimum spacing requirement of 0.5 returns/m 2 , appropriate for extracting forest metrics, such as maximum canopy height, was met [43]. Table 2. Available metadata for ALS data used in this paper. Point density was not included, but most datasets include ground point spacing. Following Reference [43], we regard all datasets to be suitable for our study. All data was downloaded from ELVIS (Elevation Information System) at https://elevation.fsdf.org.au/, accessed on 24 June 2021.

Project and Year
Height ( For all virtual plots, we clipped each ALS tile to the 30 m × 30 m plot footprints using the lasclip tool [55] in ArcMap [52] and obtained the maximum heights using lasheight [55]. We then manually extracted the maximum height from the Arc Map's layer display while visually checking all plots for ALS misclassifications (e.g., birds) and correcting when necessary.

. Statistical Analysis
We first examined the datasets for sample size, normality, and variance ( Table 3). The sample sizes varied considerably; the majority were not distributed normally, and uneven variances occurred at all the scales. For two-group comparisons, we used Welch's t-test, and for more than two groups, we determined that ANOVA with Games-Howell post hoc tests (with bootstrapping) were suitable for analysis, since these tests do not assume normality, equal variances, or sample sizes [56]. Table 3. Virtual ALS plot datasets are characterised by uneven sample sizes, non-normal data distribution, and unequal variances. We used both the Kolmogorov-Smirnov (with Lilliefors Significance Correction) and Anderson-Darling normality tests. Non-normal datasets are shown in blue, and highly unequal variances are shown in orange. * denotes maximum significance level reported by software package.

Map Production-Canopy Height Model (CHM)
This part of our research examined the implications of the regional variations in tree height in the context of providing a forest height map that could be used for habitat or forest management at the regional, subregional, and intra-subregional scales. We considered using individual tree detection (ITD), but since we only required the maximum canopy height, we decided that a canopy height model (CHM) would suffice for our purposes, with the added advantage of this approach being much easier to implement over broad areas. For the Koala Coast subregion and the Sunshine Coast study site, we extracted all ALS data overlapping RE 12.11.3 polygons and generated a 25-m resolution CHM using the canopymodel batch function in Fusion software [57] to extract the maximum height for each 25 m × 25 m cell. We again visually checked all tiles and removed misclassifications (birds, telecommunications towers, and a high-voltage powerline). We generated a 25 m × 25 m "fishnet" aligned with the CHM and extracted the CHM height values to the fishnet for analysis. We chose this cell size (0.0625 hectares) as a compromise between the Regional Ecosystem minimum mapping unit (0.5-1 ha) in Southeast Queensland [54] and some smaller units in the Tweed dataset (e.g., 0.025 ha). This cell size also provides sufficient resolution for decision-making at the habitat management and development scales. reserve in 1923, and its vegetation is also mapped as Remnant. Figure 4 shows the vegetation cover in 1940; in some areas of Quinlans, the vegetation was almost completely removed, and the remaining forest areas were extensively thinned to an open woodland. A 1958 aerial photo (not shown) indicates that tree regrowth was actively occurring in Quinlans at that time. Areas of possible disturbance in Mapleton NP 1940 are also shown and might be because of previous logging, fire, and/or storm damage or topographic influences (dryer northerly aspect).

Producing a Classified Height Map
For those involved in koala habitat or forest management decision-making, classified maps are a valuable tool that enable a feature of interest to be interpreted, compared, and evaluated in a spatial context. We now address our second research aim, "How can canopy height variability best be incorporated into a map?" Our approach was to use statistical clustering methods. Spatial clusters can be defined as a grouping of similar objects in a geographic space that are more homogenous than those from different clusters and that occupy contiguous regions [58,59] a cluster analysis uses analytical pattern recognition to group similar data points [60]. There are many different clustering algorithms, and a selection of the most suitable method requires some consideration of the data structure, e.g., data density and arrangement, and purpose, i.e., whether we sought out global or more localised cluster identification. Our dataset had distinctive characteristics: (a) height was the only variable, (b) the sample size (i.e., CHM) was 100%, (c) the data was in the form of unconnected polygons, each composed of connected groups of cells, and (d) the polygon groups had different sizes and shapes, i.e., they may be compact, linear, or complex. Unconnected polygon groups required a nonspatial global approach, whereas individual polygon groups (i.e., with contiguous cells) were also suited to spatially constrained algorithms, since they met the contiguity requirement. Our goal was to evaluate several clustering algorithms and recommend those that (a) displayed canopy height information in a user-friendly map format useful for habitat management, (b) were reproducible using readily available software, and (c) were amenable to a high degree of automation. For map production, we used two polygon groups drawn from the Koala Coast and our Sunshine Coast study site.
For these areas, we firstly derived maps showing interquartile height ranges (i.e., four classes with equal numbers of data points) and visually identified six apparent clusters of higher canopy heights. Cluster maps were generated using algorithms implemented in Geoda [61], chosen because (a) it is freely available and regularly updated software with a graphical user interface (https://geodacenter.github.io/download.html, accessed on 1 October 2021), (b) the data formats are compatible with most GIS software, and (c) a wide selection of clustering algorithms is available. We chose eight candidate algorithms suited to our cellular data and used default values for each algorithm. For each algorithm we produced maps with two, three, and four classes, and to assess the map utility and interpretability, a three-member expert panel subjectively evaluated each output using the criteria shown in Table 4. Table 4. Procedure used by an expert panel to rate each cluster map. Binary scoring (yes or no) for each identified feature was designed to discourage any assessment subjectivity.

Criterion Evaluation Score
Is the feature identified? yes 1 no 0 Is the areal extent similar? Within ±20% of the IQR * map 1 More than 25% different to the IQR map 0 Is the feature internally contiguous? all cells joined (edges and/or corners) 1 no better than the IQR map 0 * IQR Interquartile range.

Regional Variation
The RE 12.11.3 technical description [53] uses canopy heights obtained from an average of three to five trees at each CORVEG plot [54] and summarises the data from 66 plots as the mean, maximum, and minimum heights (Table 5). We were not able to replicate this procedure, so we recorded the maximum height in each of our 33-plot ALS subsamples of the CORVEG plots, which we used to compare with all the other ALS virtual plots. Since we measured the maximum height at each CORVEG plot (Table 4), the ALS plots were substantially higher (maximum height 51.4 m) than the technical description (maximum average height 37 m), and the average maximum ALS height (33.2 m) was higher than the published mean height (28 m) [53]. These results indicate that the CORVEG plots were not significantly different in height to the random plots (p = 0.057) and so, for RE 12.11.3, the Herbarium technical description adequately captures the height variations across the wider Southeast Queensland region and the adjoining Tweed Coast in New South Wales.

Subregional Variation
The 205 randomly selected virtual plots across the Southeast Queensland Bioregion were assigned to six identified subregional groups to, firstly, examine the variations between the subregions and, secondly, to identify those subregions outside the CORVEG height ranges. The average maximum ALS canopy heights measured at the CORVEG plots were substantially below the average of all the subregional group heights, except the Koala Coast and Western. The Sunshine Coast and Tweed subregional groups had higher canopy heights with a larger range of heights; in contrast, the Western group had the lowest canopy. The Koala Coast canopies were also lower, with the smallest variations in height range ( Figure 5). These results indicate that the CORVEG plots were not significantly different in height to the random plots (p = 0.057) and so, for RE 12.11.3, the Herbarium technical description adequately captures the height variations across the wider Southeast Queensland region and the adjoining Tweed Coast in New South Wales.

Subregional Variation
The 205 randomly selected virtual plots across the Southeast Queensland Bioregion were assigned to six identified subregional groups to, firstly, examine the variations between the subregions and, secondly, to identify those subregions outside the CORVEG height ranges. The average maximum ALS canopy heights measured at the CORVEG plots were substantially below the average of all the subregional group heights, except the Koala Coast and Western. The Sunshine Coast and Tweed subregional groups had higher canopy heights with a larger range of heights; in contrast, the Western group had the lowest canopy. The Koala Coast canopies were also lower, with the smallest variations in height range ( Figure 5). The Koala Coast had substantially lower mean maximum heights (31.32 m) than the average of the other near-coastal regional groups and a lower standard deviation in The Koala Coast had substantially lower mean maximum heights (31.32 m) than the average of the other near-coastal regional groups and a lower standard deviation in maximum height (3.89 m) than the average of these groups (6.5 m). The Koala Coast and Western subregions had lower average canopy heights than CORVEG; all the other subregions had higher average heights than CORVEG. The height distribution was positively skewed, particularly for the Gold Coast and Western Regions (Table 6). Table 6. Variations in the canopy height between CORVEG and six geographically distinct subregional groups within the study area ( Figure 1). The differences between CORVEG and the subregional plots were not significant for the Sunshine Coast and Tweed subregions, and between the subregions, the Koala Coast and Western subregions were not significantly different to each other, but both were significantly different to the other four subregions. These four near-coastal subregions (Brisbane, Gold Coast, Sunshine Coast, and Tweed) were not significantly different from each other (Table 7). We previously identified ( Figure 3) seven intra-subregional groups within the Koala Coast subregional group, and for this part of our analysis, we used a 100% sample rate, i.e., all 25 m × 25 m cells within each group. Figure 6 shows similar within-group variations, except for Group 6, where there was a greater range of height values. Table 8 shows height variations between these groups; compared to CORVEG, these groups showed a more compact canopy (standard error, standard deviation, and variance were all lower), and the CORVEG distribution was strongly positively skewed. The statistical comparisons are shown in Table 9.     The effects of forest disturbances are readily apparent at the Sunshine Coast intrasubregional study site, which encompasses two land tenures with different land use histories. The canopy heights fall within the CORVEG height ranges but are substantially less than Sunshine Coast virtual plots, and the canopy height in Quinlans averages 6.2 m below the adjoining Mapleton National Park (Figure 7 and Table 10) but still falls within the CORVEG parameters (e.g., range and standard deviation from the mean). Pairwise differences between CORVEG and the entire study site, and CORVEG and Mapleton National Park, were not significantly different, but all other pairwise comparisons were significantly different (Table 11).

Intra-Subregional Variation (Sunshine Coast Study Site)
The effects of forest disturbances are readily apparent at the Sunshine Coast intrasubregional study site, which encompasses two land tenures with different land use histories. The canopy heights fall within the CORVEG height ranges but are substantially less than Sunshine Coast virtual plots, and the canopy height in Quinlans averages 6.2 m below the adjoining Mapleton National Park (Figure 7 and Table 10) but still falls within the CORVEG parameters (e.g., range and standard deviation from the mean). Pairwise differences between CORVEG and the entire study site, and CORVEG and Mapleton National Park, were not significantly different, but all other pairwise comparisons were significantly different (Table 11). Figure 7. Canopy height variations for the Sunshine Coast intra-subregional study area. The study site, with 50 virtual plots, shows similar distribution to CORVEG as a whole, but Quinlans (Q) is substantially lower than CORVEG, and all the datasets are lower than the 32 Sunshine Coast random virtual plots.  lower in height, and we can conclude that Quinlans has a regenerating forest. The difference in canopy heights between the Mapleton National Park and Quinlans is illustrated in Figure 8 using the ALS profile.  Table 4), compared to CORVEG and all the other Sunshine Coast datasets, Quinlans is lower in height, and we can conclude that Quinlans has a regenerating forest. The difference in canopy heights between the Mapleton National Park and Quinlans is illustrated in Figure 8 using the ALS profile. Figure 8. ALS profile (southwest to northeast) across the Sunshine Coast local study site illustrating the forest canopy height differences between the regenerating forest (Quinlans) and mature/old growth forest (Mapleton National Park). Rainforest area is excluded from our research. Height information from Table 8. Transect length 1300 m, vertical exaggeration 7×.

Clustering and Sample Map Production
The Sunshine Coast study site and two polygon groups in the Koala Coast were selected to evaluate clustering algorithms. We chose these because; (a) we have detailed disturbance history and field data for the study site; (b); the two Koala Coast groups represent higher and lower canopy cohorts; and (c) the three polygon groups are representative of compact, linear, and complex shapes which might, potentially, influence clustering algorithms in different ways. The three groups are shown in Figure 9, coloured by the interquartile ranges (IQR) for each group of cells, with median heights for each IQR shown. By visual inspection we identified six distinct height canopy areas, marked A-E (higher canopy), and an ecotone identified during fieldwork marked F (moderate canopy height) in Figure 9.
For 90 output cluster maps, an expert panel of three visually evaluated each map to assess how well the algorithm identified the six nominated features for individual polygons, results are summarised in Table 12. Elbow plots (see Appendix A) show the class variance explained by each algorithm, indicate the appropriate number of clusters, and algorithm efficiency. The complete evaluation process is detailed in Appendix A. Figure 8. ALS profile (southwest to northeast) across the Sunshine Coast local study site illustrating the forest canopy height differences between the regenerating forest (Quinlans) and mature/old growth forest (Mapleton National Park). Rainforest area is excluded from our research. Height information from Table 8. Transect length 1300 m, vertical exaggeration 7×.

Clustering and Sample Map Production
The Sunshine Coast study site and two polygon groups in the Koala Coast were selected to evaluate clustering algorithms. We chose these because; (a) we have detailed disturbance history and field data for the study site; (b); the two Koala Coast groups represent higher and lower canopy cohorts; and (c) the three polygon groups are representative of compact, linear, and complex shapes which might, potentially, influence clustering algorithms in different ways. The three groups are shown in Figure 9, coloured by the interquartile ranges (IQR) for each group of cells, with median heights for each IQR shown. By visual inspection we identified six distinct height canopy areas, marked A-E (higher canopy), and an ecotone identified during fieldwork marked F (moderate canopy height) in Figure 9. , and complex (c) shapes, with interquartile ranges (IQR) and interquartile median height. We used these maps to assess clustering algorithms. Higher-canopy spatial patterns are evident, for e.g., patches of higher canopy (A-E, light blue), and a canopy area of moderate height (F, dark green, some light green and light blue), surrounded by canopy with slightly lower height (light green). Heights appear more randomly distributed in other areas (light green, dark green) except for areas of lowest canopy height (dark blue). All maps use the same scale.  Figure 10) were captured by each algorithm, each with 2, 3, and 4 classes. Panellists first independently allocated scores, disagreements were highlighted, and the panellists re-evaluated their scores. Key: P-feature identified (1), not identified (0); A-feature has similar area (1), smaller or larger (0); C-feature contiguous (1), not contiguous (0), 's' prefix denotes spatially constrained algorithm. 1 1 1 = panellists agree that feature recognised by algorithm using all Figure 9. Polygon groups representing compact (a), linear (b), and complex (c) shapes, with interquartile ranges (IQR) and interquartile median height. We used these maps to assess clustering algorithms. Higher-canopy spatial patterns are evident, for e.g., patches of higher canopy (A-E, light blue), and a canopy area of moderate height (F, dark green, some light green and light blue), surrounded by canopy with slightly lower height (light green). Heights appear more randomly distributed in other areas (light green, dark green) except for areas of lowest canopy height (dark blue). All maps use the same scale.
For 90 output cluster maps, an expert panel of three visually evaluated each map to assess how well the algorithm identified the six nominated features for individual polygons, results are summarised in Table 12. Elbow plots (see Appendix A) show the class variance explained by each algorithm, indicate the appropriate number of clusters, and algorithm efficiency. The complete evaluation process is detailed in Appendix A. Figure 10. Nonspatial K-medians (a-c) and nonspatial hierarchical cluster maps (d-f) for individual polygons with 3 height classes. Letters A-F indicate areas of higher canopy as shown in Figure 9. Five of six features were identified, and cluster areas are similar except for feature D (c,f). Feature F was partially captured by K-medians (d).
Three-class cluster maps with the highest evaluation scores are shown in Figure 10. Nonspatial K-means and hierarchical clustering successfully identified five of six features, but cluster areas varied substantially between algorithms for Feature D. Feature F, the forest ecotone community, was only partially captured by K-medians.
Four-class cluster maps are shown in Figure 12. K-medians again ranked highest within the nonspatial algorithms, but several features have less well-defined clusters than three-class maps. Redcap [62] had the highest score from the expert panel, with tightly defined contiguous clusters for all features. Redcap extended the area of feature A, and effectively shows the height difference between Mapleton National Park and regenerating forest in Quinlans (see Figure 4). Redcap also accurately captures the extent of the forest ecotone in Quinlans (Feature F). However, compared to Figure 9 and other cluster maps (Figure 10), Redcap overestimated the extent of one feature (D) and underestimated another (E). Table 12. Visual evaluation by a panel composed of one author (DM), a GIS analyst (DS) and an ecologist (DJ); for maps produced for individual polygons by clustering algorithms. Maps were ranked according to how well six nominated features (A-F in Figure 10) were captured by each algorithm, each with 2, 3, and 4 classes. Panellists first independently allocated scores, disagreements were highlighted, and the panellists re-evaluated their scores. Key: P-feature identified (1), not identified (0); A-feature has similar area (1), smaller or larger (0); C-feature contiguous (1), not contiguous (0), 's' prefix denotes spatially constrained algorithm. 1 1 1 = panellists agree that feature recognised by algorithm using all criteria-0 0 0 feature not recognised using any criteria-Dark grey cells indicate panellists agree with individual criteria, 0/1 one panellist disagrees. For all cells, there was 86.3% agreement by all panellists. Panellists most often disagreed on whether feature D was captured. Nonspatial K-medians (all classes) has the highest overall score, followed by nonspatial K-means. For 4 classes, Redcap ranked slightly higher than nonspatial K-medians. The panel evaluation for multiple polygon cluster maps produced using all Koala Coast subregional data is summarised in Table 13. Table 13. Visual evaluation of the clustering algorithms using all height data within the Koala Coast subregional group (6007 cells), scored using the same methods as Table 12. Maps were ranked according to how well four nominated features (B-E in Figure A11) were captured by each algorithm, each with 2, 3, and 4 classes. For all cells, there was 84.3% agreement by all panellists, and the panellists disagreed most often on whether the feature area (column A) was captured. K-means and K-medians are ranked equally for 3 classes, but K-medians is ranked higher for 4 classes. K-means and K-medians were ranked equally highly for three classes (Figure 11), but K-medians was ranked higher for four classes. Features D and E were identified, but because features B and C were not high compared to D and E, were assigned to lower canopy height classes. In addition, areas of High Value Regrowth Forest [63] were identified by both algorithms. differences reflecting the land use history of Quinlans compared to Mapleton National Park. Redcap also accurately captures the extent of an ecotone identified during fieldwork (F); this ecotone has a higher proportion of a preferred koala food tree than the surrounding area (dark blue), and which is absent from the national park. Figure 12. Nonspatial K-medians cluster maps for the Group 6 (low canopy) polygon (a,c) and Group 7 (high canopy) polygon (b,d) using all the height data from Koala Coast subregional Groups 1-7 ( Figure 3). Letters B-E indicate areas of higher canopy as shown in Figure 9. Features D and E are clearly identified but B and C are not, because they belong to the second-and third-highest height classes across the wider Koala Coast subregion. B and C are still significantly higher than the lowest height class (a,c, dark blue). Dark blue areas in (a,c) accurately capture areas of Mature Regrowth vegetation [63], outlined in black. Figure 11. Nonspatial K-medians cluster maps for the Group 6 (low canopy) polygon (a,c) and Group 7 (high canopy) polygon (b,d) using all the height data from Koala Coast subregional Groups 1-7 ( Figure 3). Letters B-E indicate areas of higher canopy as shown in Figure 9. Features D and E are clearly identified but B and C are not, because they belong to the second-and third-highest height classes across the wider Koala Coast subregion. B and C are still significantly higher than the lowest height class (a,c, dark blue). Dark blue areas in (a,c) accurately capture areas of Mature Regrowth vegetation [63], outlined in black.  Figure 11. Nonspatial K-medians (a-c), nonspatial hierarchical (d-f) and Redcap [64] spatial hierarchical (g-i) cluster maps for individual polygons (refer Figure 9) with 4 height classes. Letters A-F indicate areas of higher canopy as shown in Figure 9. All six features are identified, but features A, D, and F are not contiguous. In comparison, the spatially constrained algorithm Redcap (g-i) identifies all the features as contiguous but assigns C, D, and F to a lower height class. Compared to Figure 9, Redcap overestimates D and underestimates E; A is also larger, because it also captures height Figure 12. Nonspatial K-medians (a-c), nonspatial hierarchical (d-f) and Redcap [64] spatial hierarchical (g-i) cluster maps for individual polygons (refer Figure 9) with 4 height classes. Letters A-F indicate areas of higher canopy as shown in Figure 9. All six features are identified, but features A, D, and F are not contiguous. In comparison, the spatially constrained algorithm Redcap (g-i) identifies all the features as contiguous but assigns C, D, and F to a lower height class. Compared to Figure 9, Redcap overestimates D and underestimates E; A is also larger, because it also captures height differences reflecting the land use history of Quinlans compared to Mapleton National Park. Redcap also accurately captures the extent of an ecotone identified during fieldwork (F); this ecotone has a higher proportion of a preferred koala food tree than the surrounding area (dark blue), and which is absent from the national park.

Discussion
Current koala habitat mapping uses regional-scale vegetation maps incorporating a high degree of floristic variability, which may limit their application at the local scale (1-300 hectares), where tree species variability, i.e., food resource availability, becomes more apparent. Since koalas prefer larger trees [13,16], incorporating this variable into a map overlay might be useful for habitat management at this scale [3]. Our study was aimed at how best to incorporate larger trees into the current habitat maps using heights obtained from ALS. To derive the parameters for an overlay map, we used published information as a baseline and found that, at the regional scale, forest canopy heights are within published parameters [53] but, at the subregional and intra-subregional scales, the height variability is generally outside these parameters. Additionally, the variability is both significantly different between subregions and within subregions. Two height parameters demonstrated the effects of previous land use. In regenerating forests, not only is the canopy height lower, but the canopy has a lower standard deviation in height, i.e., the canopy is "flatter", with shade-tolerant and shade-intolerant species in active competition [17]. The range in the standard deviation therefore has potential for identifying similar areas of regenerating forests.
To incorporate ALS canopy height information into a habitat map overlay, we examined several clustering algorithms and found that nonspatial K-medians best characterised the nominated high-canopy areas for multiple polygons at the intra-subregional scale. This approach might be useful for height comparisons between different areas for broader management purposes but produced clusters suffering from a lack of contiguity, limiting their use at finer scales. At the site scale, i.e., individual habitat map polygons, the spatially constrained hierarchical clustering algorithm Redcap [61,62] was able to: (a) provide greater map utility by identifying contiguous clusters; (b) separate historically disturbed, lower canopy height forests from relatively undisturbed higher canopy forests, and (c) capture a forest ecotone with a higher proportion of preferred koala food resources. Redcap further showed its utility in another polygon by capturing an area independently mapped as Mature Regrowth Forest [63]. We used algorithms requiring no user input so that our approach could be automated, allowing any number of forest map polygons to be clustered at any desired scale. The number of algorithms we used, and the consistent results between many of them, confirmed that most of our visually nominated features were, indeed, real clusters of similar heights and that our algorithm recommendations were valid. This multifaceted approach was recommended by Ketchen and Shook [64], who warned against adopting any single clustering algorithm. However, we noted that two of our nominated features were not recognised as being statistically significant by any algorithm, which shows that arbitrary classifications of canopy height (a) may not be meaningful and (b) may not be repeatable.
The previous use of multivariate modelling that included ALS-derived canopy height as a variable has shown that this metric features prominently within these analyses, for example, by using multivariate K-means clustering to portray forest structures, one study found that most landscape indices were rendered redundant by the median and standard deviation of the canopy height [65], i.e., ALS provides directly measured predictor variables and improves the accuracy of maps compared to inferred variables. In Alberta, Canada, a cluster analysis with ALS variables (canopy height, cover, and height variation) was used to identify areas of high conservation values within mapped forest polygons at the regional scale [66]. In another example, a Northern Spotted Owl (Strix accidentalis caurina) nesting habitat in Oregon, USA, mapped using traditional approaches (aerial photo interpretation, satellite imagery, and environmental predictor variables), was substantially improved with the addition of the ALS-derived large tree density and average stand height [67], and the importance of height was confirmed by another study in Northern California [68]. The maximum canopy height was found to be the best predictor of Spotted Owl habitats in Northern California (Strix occidentalis) [69]. In The Netherlands, a study found that, for species distribution models, the ALS-derived canopy height was a better predictor of bird species presence/absence than a suite of land cover variables [70] and had advantages at multiple scales where species habitat requirements are either not known or vary according to particular needs (e.g., feeding, breeding, and dispersal) [71].
We have recommended the use of Redcap with four classes for individual polygons, but to further maximise the overlay map utility at broader scales, for example, in conservation planning, we recommend that a height overlay also includes height classes derived from K-medians. For at-risk species dependent on large hollow-bearing trees, such as the greater glider (Petauroides volans) [72], this multiscale approach might be more appropriate. Koala researchers have previously stratified field sites based on low-resolution vegetation community types and/or geology maps [18]; sampling strategies using height clusters could reduce field efforts by effectively sampling areas of mature forest, regrowth, or ecotones within these broader map units. A similar approach using LiDAR has been used to increase the efficiency of survey plot sampling designs in pine plantations [73].
In Queensland, the distinction between Remnant and Regrowth vegetation is important, because, e.g., Remnant Vegetation [54] is subject to stricter legislation governing land use compared to Regrowth or other undefined forest; therefore, it is necessary to accurately differentiate between these forest types. We identified potential mapping errors within our sample forest communities by identifying areas below the minimum height required for classification as Remnant Vegetation [54]; the application of our methods could highlight these mapping errors so they can be corrected. During our research, we noted gaps and overlaps between the Remnant and Mature Regrowth datasets, and because we successfully captured mapped areas of Mature Regrowth [63], our methods could also be used to rectify these mapping errors.
We acknowledge that our study was limited in scope, with the Sunshine Coast and Western areas having only approximately 30% LiDAR coverage. As time goes on, we can expect greater LiDAR coverage, not only from ALS but from newer LiDAR platforms, e.g., the recently commissioned Global Ecosystem Dynamics Investigation (GEDI) satellite [74]. Our study only examined one regional vegetation community and assessed three community polygons with clustering, so our results may not be applicable to all forest communities. However, by choosing a floristically variable community with a likely corresponding variability in canopy height, and with disturbance information for one polygon and partial disturbance information for another, we regard our study as a valuable contribution to this topic.

Conclusions
In this study we showed that, because forest canopy heights vary at the subregional and intra-subregional scales, height overlay maps need to incorporate this variability. A regional framework provides an overall picture of the forest canopy height, but at the subregional scale and intra-regional scale relevant to habitat management, the overlay can incorporate attributes derived from the canopy height at these scales. This approach can be implemented over broad areas to produce a multiscale map that can be used to capture an important koala habitat attribute, i.e., that koalas prefer larger trees at any given location.
For koalas, which constantly face development pressure, habitat maps depicting areas of large trees could provide valuable assistance in identifying valuable habitats for protection. Our clustering approach could provide unambiguous and independent tools to assess development at both the site and local government authority scales, reducing the reliance of development approval authorities on information supplied by development proponents. Other potential uses include the identification of candidate old-growth forests, mapping areas of larger hollow-bearing trees required for fauna, conservation reserve planning, and improvement in the spatial accuracy of forest community mapping. , and complex (c) polygon groups, showing the interquartile ranges (IQR) and interquartile median canopy heights within each group. We used this map to assess and compare the clustering algorithm outputs which follow. Spatial patterns are evident, for, e.g., patches of the highest canopy (A-E, light blue) and an area of moderate height (F, dark green), surrounded by a canopy with a slightly lower height (light green). The height appears more randomly distributed in other areas, except for lower canopy height areas (dark blue). All maps use the same scale.    Figure A3. Spatially constrained K-means (a-c) and K-medians (d-f) with 2 classes. These algorithms use spatial weights (x, y), which, in all cases, outweigh those of the object (h = height). Letters A-F indicate areas of higher canopy, only one of the designated features (E) is adequately captured using these algorithms (f). Panel scores: K-means = 6 and K-medians = 8. Figure A3. Spatially constrained K-means (a-c) and K-medians (d-f) with 2 classes. These algorithms use spatial weights (x, y), which, in all cases, outweigh those of the object (h = height). Letters A-F indicate areas of higher canopy, only one of the designated features (E) is adequately captured using these algorithms (f). Panel scores: K-means = 6 and K-medians = 8.         . Spatially constrained K-means (a-c) and K-medians (d-f) cluster maps with 4 classes. Letters A-F indicate areas of higher canopy. Again, these algorithms use spatial weights (x, y) that far outweigh those of the object (h = height). As with 3 classes for these algorithms, the only features adequately captured are A (a,d) and E (c,f). Panel scores: K-means = 7 and K-medians = 7. Figure A9. Spatially constrained K-means (a-c) and K-medians (d-f) cluster maps with 4 classes. Letters A-F indicate areas of higher canopy. Again, these algorithms use spatial weights (x, y) that far outweigh those of the object (h = height). As with 3 classes for these algorithms, the only features adequately captured are A (a,d) and E (c,f). Panel scores: K-means = 7 and K-medians = 7. Redcap also recognises F, the unmapped area of Eucalyptus tereticornis (dark green), which is slightly lower in height than the national park canopy but higher than the surrounding forest (dark blue). All algorithms portray C (b,e,h) as slightly lower than B, but the areal extent of D is exaggerated by Skater (i). Skater is the algorithm that best captures the areal extent of E (i). Panel scores: Hierarchical clustering = 14, Skater = 15, and Redcap = 16. Redcap also recognises F, the unmapped area of Eucalyptus tereticornis (dark green), which is slightly lower in height than the national park canopy but higher than the surrounding forest (dark blue). All algorithms portray C (b,e,h) as slightly lower than B, but the areal extent of D is exaggerated by Skater (i). Skater is the algorithm that best captures the areal extent of E (i). Panel scores: Hierarchical clustering = 14, Skater = 15, and Redcap = 16.

Appendix A.3. Multiple Polygon Clustering
Non-contiguous polygons can only be clustered by nonspatial algorithms, i.e., Kmeans, K-medians, and hierarchical clustering. For the Koala Coast area, we selected two polygon groups, one from intra-subregional Group 6 (212 cells, 13.3 ha) and one from Group 7 (1357 cells, 84.8 ha), i.e., a polygon from each of the high-and low-canopy cohorts. We again produced a canopy height map classified by interquartile height ranges (IQR) as a reference point but with height data from 6007 cells from all seven Koala Coast intrasubregional groups ( Figure A11). Some clustering is apparent in the maps-in particular, in Map (b), two areas of higher canopy (D and E), and a section of lower canopy in the southeast. In Map (a), the higher canopy areas (B and C) are differentiated from the lowcanopy height (dark blue, IQR 1). Figure A11 thus identifies the same features as Figure A1 but provides a subregional context, i.e., whilst B and C are higher than the remainder of their polygon group, they are significantly lower than D and E and similar areas across the subregion.

. Multiple Polygon Clustering
Non-contiguous polygons can only be clustered by nonspatial algorithms, i.e., Kmeans, K-medians, and hierarchical clustering. For the Koala Coast area, we selected two polygon groups, one from intra-subregional Group 6 (212 cells, 13.3 ha) and one from Group 7 (1357 cells, 84.8 ha), i.e., a polygon from each of the high-and low-canopy cohorts. We again produced a canopy height map classified by interquartile height ranges (IQR) as a reference point but with height data from 6007 cells from all seven Koala Coast intrasubregional groups ( Figure A11). Some clustering is apparent in the maps-in particular, in Map (b), two areas of higher canopy (D and E), and a section of lower canopy in the southeast. In Map (a), the higher canopy areas (B and C) are differentiated from the lowcanopy height (dark blue, IQR 1). Figure A11 thus identifies the same features as Figure  A1 but provides a subregional context, i.e., whilst B and C are higher than the remainder of their polygon group, they are significantly lower than D and E and similar areas across the subregion. Figure A11. Selected polygon groups from Koala Coast intra-subregional Groups 6 (a, low-canopy cohort) and 7 (b, highcanopy cohort). Letters B-E indicate areas of higher canopy These maps are similar to Figure A1, with features B-E recognisable, but, because they encompass the entire canopy height range within the Koala Coast, B and C (a) are in middle IQRs (interquartile ranges), D and E (b) are still in the upper quartile. Lower canopy heights (b, dark blue) adjoining B and C are pronounced. Cell colours represent the IQRs of all height data within the Koala Coast, i.e., intra-subregional Groups 1-7. IQR median heights are also shown. Figure A11. Selected polygon groups from Koala Coast intra-subregional Groups 6 (a, low-canopy cohort) and 7 (b, highcanopy cohort). Letters B-E indicate areas of higher canopy These maps are similar to Figure A1, with features B-E recognisable, but, because they encompass the entire canopy height range within the Koala Coast, B and C (a) are in middle IQRs (interquartile ranges), D and E (b) are still in the upper quartile. Lower canopy heights (b, dark blue) adjoining B and C are pronounced. Cell colours represent the IQRs of all height data within the Koala Coast, i.e., intra-subregional Groups 1-7. IQR median heights are also shown.

Appendix A.4. Elbow Plots
Our assumption was that two, three, or four cluster class maps would have the greatest utility in terms of user practicality, but we generated maps with 2-10 clusters to allow use of the elbow method ( Figures A15-A17) to estimate the optimal number of clusters [64]. The plots indicate that the elbow position is not clearly demarcated but occurs between two and four clusters; after four clusters, additional clusters explain less and less variance. Nonspatial methods show the greatest consistency between algorithms, and differences between spatial algorithms may be due to different polygon group shapes, i.e., compact, linear, and complex. Appendix A. 4

. Elbow Plots
Our assumption was that two, three, or four cluster class maps would have the greatest utility in terms of user practicality, but we generated maps with 2-10 clusters to allow use of the elbow method ( Figures A15-A17) to estimate the optimal number of clusters (Ketchen and Shook 1996). The plots indicate that the elbow position is not clearly demarcated but occurs between two and four clusters; after four clusters, additional clusters explain less and less variance. Nonspatial methods show the greatest consistency between algorithms, and differences between spatial algorithms may be due to different polygon group shapes, i.e., compact, linear, and complex. Figure A15. Plot of the percentage of variance explained as a function of the number of clusters generated by eight algorithms for the Sunshine Coast study site. This elbow plot indicates that 2-4 classes is the optimum, e.g., for nonspatial algorithms, 3 classes explain 82.7% (K-means), 79.2% (K-medians), and 80.9% (Hierarchical clustering) of the variance, and 4 classes explain 90.3% (K-means), 89.2% (K-medians), and 90.0% (Hierarchical clustering) of the variance. In comparison, spatially constrained algorithms (sc) explain a maximum of 56% (3 classes) and 63% (4 classes) of the variance. Redcap (sc) Figure A15. Plot of the percentage of variance explained as a function of the number of clusters generated by eight algorithms for the Sunshine Coast study site. This elbow plot indicates that 2-4 classes is the optimum, e.g., for nonspatial algorithms, 3 classes explain 82.7% (K-means), 79.2% (K-medians), and 80.9% (Hierarchical clustering) of the variance, and 4 classes explain 90.3% (K-means), 89.2% (K-medians), and 90.0% (Hierarchical clustering) of the variance. In comparison, spatially constrained algorithms (sc) explain a maximum of 56% (3 classes) and 63% (4 classes) of the variance.    Figure A16. Plot of percentage of the variance explained as a function of the number of clusters generated by eight algorithms for the Koala Coast intra-subregional Group 6 polygon group. This elbow plot indicates that 2 to 3 classes is the optimum, e.g., for nonspatial algorithms, 2 classes explain 86.9% (K-means), 86.3% (K-medians), and 84.3% (Hierarchical clustering) of the variance. For spatial algorithms (sc), there is no clear elbow, with the Redcap algorithm explaining a maximum of 33.2% of the variances for 3 classes.   Table A1. Procedure used to rate each cluster map. Binary scoring (yes or no) for each identified feature is designed to discourage subjectivity. Letters P, A, and C refer to codes in Tables 12 and 13.

Criterion Evaluation Score
Is the feature identified? yes 1 P = present no 0 Is the areal extent similar? Within ±20% of the IQR * map 1 A = area More than 25% different to the IQR map 0 Is the feature internally contiguous? all cells joined (edges and/or corners) 1 C = contiguity no better than the IQR map 0 * IQR Interquartile range ( Figures A1 and A11).
The panellists first independently allocated scores, their disagreements were highlighted, and the panellists then re-evaluated their scores. These scores are shown with each of the above figures. The results for all algorithms are shown in Table 12; for nonspatial clustering, K-medians was assigned the highest rank for the two, three, and four classes. In comparison, the spatially constrained algorithms (Hierarchical, Skater, Redcap) performed poorly for the two and three classes but slightly better than the nonspatial algorithms for the four class. No single algorithm was judged to capture all the features, and the panellists had the most disagreement with feature D. The spatially constrained K-means and K-medians performed particularly poorly, identifying only one feature.
For two or more non-contiguous polygon groups, only the nonspatial algorithms (K-means, K-medians, and hierarchical clustering) are suitable clustering methods. For two sample polygon groups ( Figure A11) selected from the highest (Group 7) and lowest (Group 6) canopy height cohorts, we used the entire Koala Coast canopy height model to evaluate these algorithms. Comparing multiple polygon groups within a subregion allows for the generation of multiscale clustering maps. Algorithms with two classes did not capture any features. K-medians captured features D and E for three and four classes and was assigned the highest rank (Table 13). Table A2. Height differences (metres) between the highest cluster median height and next-lowest cluster median height. Spatially constrained K-means and K-medians have the smallest spread between these height classes, while the other spatial algorithms have a larger spread but are still less than the nonspatial methods. n.a. = algorithm not applicable.

Algorithm
Classes Figure