Assessing the Impact of Spectral Resolution on Classification of Lowland Native Grassland Communities Based on Field Spectroscopy in Tasmania, Australia

This paper presents a case study for the analysis of endangered lowland native grassland communities in the Tasmanian Midlands region using field spectroscopy and spectral convolution techniques. The aim of the study was to determine whether there was significant improvement in classification accuracy for lowland native grasslands and other vegetation communities based on hyperspectral resolution datasets over multispectral equivalents. A spectral dataset was collected using an ASD Handheld-2 spectroradiometer at Tunbridge Township Lagoon. The study then employed a k-fold cross-validation approach for repeated classification of a full hyperspectral dataset, a reduced hyperspectral dataset, and two convoluted multispectral datasets. Classification was performed on each of the four datasets a total of 30 times, based on two different class configurations. The classes analysed were Themeda triandra grassland, Danthonia/Poa grassland, Wilsonia rotundifolia/Selliera radicans, saltpan, and a simplified C3 vegetation class. The results of the classifications were then tested for statistically significant differences using ANOVA and Tukey’s post-hoc comparisons. The results of the study indicated that hyperspectral resolution provides small but statistically significant increases in classification accuracy for Themeda and Danthonia grasslands. For other classes, differences in classification accuracy for all datasets were not statistically significant. The results obtained here indicate that there is some potential for enhanced detection of major lowland native grassland community types using hyperspectral resolution datasets, and that future analysis should prioritise good performance in these classes over others. This study presents a method for identification of optimal spectral resolution across multiple datasets, and constitutes an important case study for lowland native grassland mapping in Tasmania.


Lowland Native Grasslands
Australia is renowned for the uniqueness and diversity of its flora and fauna, and the State of Tasmania, which is home to the world's largest remaining temperate wilderness, is no exception. Much of the environmental research and ecological conservation effort within the State is focused on the preservation of temperate rainforests and alpine environments. However, Tasmania is also home to a much lesser known, but equally biodiverse and important, group of vegetation communities known as lowland native grasslands. These native grasslands are found throughout the eastern interior of the State in the prime agricultural area known as the Midlands region [1]. Lowland native grasslands are recognised as one of the most threatened groups of vegetation communities across Australia [2], and form part of the Midlands region biodiversity 'hotspot' [1]. Under Federal Australian law, 25 of the approximately 750 species found in these communities are protected under the Environment Protection and Biodiversity Conservation Act (EPBC) [3], and an additional 60 species are listed under the Tasmanian Threatened Species Act [4]. However, despite such protections, loss of native grassland communities across the State is a major concern. In the 200 years since European colonisation, the extent of these communities has decreased dramatically [5]. Beeton [1] estimated that the extent of lowland native grasslands has decreased by up to 90%, while Gilfedder [6] estimated the decrease to be approximately 60% from an original estimated extent of 450 km².
Major threats to lowland native grassland communities include agricultural practices, weed invasion, fragmentation, loss of species diversity, and anthropogenically induced change to species distribution [1]. The largest threat, however, is land clearing. Even though clearing of such communities is illegal, private land holdings with significant remnant patches of lowland native grassland communities are currently identified through self-reporting. This is problematic in that it relies on a concerted effort by land-holders to identify, quantify, and report grassland presence on a voluntary basis. The primary land use of lowland native grasslands is grazing of ewes and cattle; the grasslands also act as a valuable source of feed during the drier periods of the year, and provide shelter for lambing ewes [3]. The nutritional content of these grasses, however, is much lower than that of introduced pasture species such as lucerne or ryegrass, meaning that native pastures are unable to support as many ewes per hectare as introduced pastures [3]. This creates an incentive for landowners to remove native communities in favour of more productive alternatives.
In 2012, construction of the Midlands Water Scheme began. This scheme is responsible for the transportation of 45,000 million litres of water a year from Arthurs Lake and the South Esk River to the Midlands region for agricultural irrigation [7]. As part of the conditions of construction of the Midlands Water Scheme, the Tasmanian Department of Primary Industry, Parks, Water and Environment (DPIPWE) agreed to establish a long-term community monitoring scheme to ensure that there was no loss or degradation of pre-existing lowland native grassland communities within the area [8]. The requirement for frequently updateable maps of lowland native grassland community extents has been previously established as a major research need for the State of Tasmania [9]. However, this requirement means that significant changes must be made to the way in which vegetation communities are mapped in the State. The current vegetation mapping program for the State of Tasmania, known as TASVEG, is primarily based on manual digitisation of aerial photography [10]. This causes several issues with regard to the objectivity of the generated maps, as well as the ability of the maps to be regularly updated. Given infrequent acquisition times of the required source imagery, the aerial datasets used are quickly outdated, and the same images are often used across multiple updates of the final product [10].
Semi-automated remote-sensing-based approaches provide an attractive alternative to the currently used approaches, particularly given the frequent revisit time of many satellite-borne sensors. Melville et al. [11] undertook a study into the feasibility of using multispectral remote sensing platforms to accurately discriminate both between different types of lowland native grassland communities, and between grasslands and commonly co-occurring vegetation types, through a comparison of classification results obtained from Landsat ETM+ and WorldView-2 datasets. The results of this study indicated that accurate discrimination was possible using multispectral data; however, there was some indication of improved discrimination between lowland native grassland communities in the higher spectral resolution WorldView-2 datasets. Given the prevalence of hyperspectral remote sensing approaches in both native and non-native grassland mapping, the authors called for further investigation into whether community composition could be improved through the use of such datasets.

Hyperspectral Analysis of Grasslands
High-Spectral-Resolution (HSR) data has been shown to have advantages over broadband multispectral data in many cases. HSR datasets have been shown to be highly effective in grassland community mapping, particularly when the communities bear significant similarities to one another. For example, Mutanga and Skidmore [12] found that narrowband indices generated from hyperspectral data were able to provide better estimates of grassland biomass than broadband equivalents. Many narrowband indices and isolated regions of the electromagnetic spectrum have additionally been found to be strongly correlated with grassland properties. Significant relationships have been found between levels of dry and wet biomass present in grasslands and spectral reflectance in the region of the spectrum between 350 nm and 450 nm [13]. Other authors have also found relationships between the red and NIR portions of the spectrum and differences in grassland biomass [12,14,15]. Although the majority of publications using HSR narrowband datasets for grassland analysis focus on prediction of biomass and other biophysical attributes rather than community classification, the findings of such research still provide a valuable framework from which to build community classification approaches. In fact, the findings of such previous research would indicate a strong potential for community differentiation based on biophysical and biochemical properties such as biomass, pigment levels, and water content [13]. In addition, consistent issues with multispectral analysis of complex vegetation communities have been identified, highlighting the need for more detailed HSR approaches [16].
Narrowband spectral analysis of communities has also been postulated as a potential source of reliable validation data for broadband multispectral approaches [16]. The caveat of such approaches, however, is that there is a definite need for the spatial scale of analysis to be carefully considered and optimised to produce reliable estimates of vegetation parameters [16,17].

Aims and Objectives
The purpose of this study was to determine if the use of HSR narrowband datasets can provide an improvement in class discrimination between lowland native grassland communities and commonly co-occurring vegetation. Key regions of spectral separability between classes were determined based on spectral signatures collected with a narrowband handheld field spectroradiometer and convolution approaches. Comparison between spectral configurations and datasets was then used to determine the optimal spectral resolution required to accurately discriminate between the target communities.

Study Site
The site selected for this study is the Tunbridge Township Lagoon Reserve, located in the Tasmanian Midlands region approximately 100 km from the capital of Hobart. The Midlands are a seminatural landscape defined by mountainous borders, low altitude, dry climate, broad fertile river valleys, and plains [9]. The average annual minimum temperature in the region is 5.7 °C, and the average maximum temperature is 17.7 °C. The average rainfall is 488 mm per annum. The 20 ha reserve is the only designated protected area in the State for the endangered lowland native grassland communities, and has therefore been the target of detailed vegetation studies in the past [18]. The site contains a shallow saltwater lagoon, surrounded by native grassland communities. The eastern third of the site contains a small hill intergrading into an open grassy woodland on neighbouring properties. This site was selected as it exhibits excellent examples of major lowland native grassland community types within a small area, and is free from factors that may cause confusion between vegetation classes, such as grazing and fertilisation.

Class Descriptions
For this study, four classes were defined: (1) saltpan; (2) Wilsonia rotundifolia/Selliera radicans; (3) Danthonia sp./Poa sp. dominated grasslands; and (4) Themeda triandra dominated grasslands. These classes are loosely based on the floristic communities identified by [18], who proposed two additional communities (Calocephalus lacteus open grassland and Lolium perenne grassland), which were excluded from this analysis due to insufficient data points and limited observable extent.
The saltpan class covers exposed soil with no vegetative cover, and includes both the lagoon itself and the surrounding mud flats. The Wilsonia rotundifolia/Selliera radicans class covers areas of saltpan where the dominant cover consists of either of these two species, with some cover of Puccinellia stricta or Lolium sp. This class is similar to the Puccinellia stricta grassland class proposed by [18]. The Danthonia grassland class is quite broad, covering any area that contains Danthonia sp. or Poa sp. (such as P. rodwayi or the larger P. labillardierei) regardless of the intertussock species. The definition of this class is similar to that of the grassland complex class found in the TASVEG community listings, and represents one of the three major lowland native grassland community types along with Poa labillardierei grasslands and Themeda triandra grasslands. This class covers areas around the edge of the lagoon. There is significant variation in intertussock species throughout the class, and intergrading with both the Wilsonia class and the Themeda class is in some places extensive. The main distinguishing factor for this class is that the dominant grass species are all cool season C3 grasses, exhibiting a typical winter-spring growth period, and a period of senescence over the summer months. The final class, Themeda grassland, represents a typical lowland native grassland community, and covers areas located in the eastern half of the site. Themeda triandra is dominant in these areas, with some dispersed trees and shrubs, primarily Acacia dealbata and Bursaria spinosa. Zacharek et al. [18] theorised that the remnant Themeda communities found on the site originally formed an open grassy woodland, due to the occurrence of Eucalyptus ovata specimens on neighbouring properties. The distinguishing feature of this class is that the dominant species is always Themeda triandra, which is a warm season C4 grass, typically growing in the warmer summer months and entering senescence in late autumn to winter. Figure 1 shows photographs of the four vegetation classes.
In addition to the four-class configuration, a simplified three-class configuration was also used. One of the main issues identified in [11] was confusion between grassland communities (most notably for lowland native grassland complex) and other vegetation with similar phenological cycles and photosynthetic pathways. The three-class configuration was tested to aid in the identification of causal mechanisms of confusion between similar community groups found at the Tunbridge study site. For this configuration, all vegetation with a C3 photosynthetic pathway (i.e., the Wilsonia/Selliera class and the Danthonia class) was merged into a single class. Vegetation with a C3 photosynthetic pathway composes the majority of all vegetation and is typically found in temperate environments, while C4 vegetation has evolved to live in more extreme conditions [19]. The main distinction between C3 and C4 vegetation (such as Themeda triandra) is that during photosynthesis, C3 vegetation produces a three-carbon acid, known as 3-phosphoglyceric acid, while C4 vegetation produces a four-carbon product known as oxaloacetate. The remaining two classes in this configuration were saltpan and Themeda triandra grassland.

Data Collection
In November 2015, a field campaign was carried out to collect the data used in this study. ArcGIS 10.3 was used to create a series of sampling sites throughout the reserve. A 5 m buffer was first applied to the edge of the reserve, and a total of 55 sample sites were created within the buffered zone using the 'Create Random Points' tool. The minimum spacing between points was set to 10 m to ensure no overlap between sites. Data were collected in late November, as the late spring and early summer months are recommended times to undertake vegetation surveys in lowland native grasslands in order to capture the full floristic diversity [20]. As the current research need for lowland native grasslands is the establishment of single date, readily updateable community extent maps, only a single data acquisition date was used. At each site, a 5 m × 5 m star transect was created by running a measuring tape north to south and then east to west with the random coordinate of the site situated at the transect centre. Spectral signatures were collected using an ASD Handheld-2 spectroradiometer, which collects signatures between 375 nm and 1075 nm in 1 nm increments. Vegetation types were recorded and photographed at the transect centre, and at 2.5 m and 5 m in each compass direction along the transect, for a total of 9 observations. At each observation interval along the transect, three spectral signatures were collected at nadir and off-nadir angles in order to record all variation in species composition and structure. Collection of narrowband spectral measurements in this way has been shown to reduce uncertainty in measurements of biological and geographical parameters derived from image spectra [15]. The sampling height for the ASD was set to 1 m, giving a 46 cm diameter footprint for each spectral signature based on the 25° field of view of the spectrometer. All observations were recorded with the operator facing into the sun behind the device to avoid shadowing. The approximate angle for the off-nadir collection was 20° left and right of the nadir-view angle. The integration time for the spectroradiometer was optimised at each new site, and a dark current subtraction and a white reference observation (of a Spectralon panel) were performed at each observation site along each transect. This was done in order to (i) reduce drift and noise of the spectroradiometer; (ii) reduce potential variation between sites as a result of varying solar angles throughout the day; and (iii) reduce potential issues arising from cirrus cloud cover common to the area. In addition, each reading from the sensor was averaged over 20 observations in order to reduce noise. Figure 2 shows the location of each transect throughout the site, and the general composition of the landscape. Table 1 shows the number of spectral samples collected for each class.
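The relationship between sampling height, field of view, and footprint size can be illustrated with simple viewing geometry. The following is a minimal sketch, assuming an idealised conical field of view pointed at nadir over flat ground; the function name is illustrative and not part of the study's workflow.

```python
import math


def footprint_diameter(height_m: float, fov_deg: float) -> float:
    """Diameter (m) of the circular nadir footprint of an idealised conical
    field of view: d = 2 * h * tan(FOV / 2)."""
    half_angle = math.radians(fov_deg) / 2.0
    return 2.0 * height_m * math.tan(half_angle)


# Sampling geometry described in the text: 1 m height, 25 degree field of view.
diameter = footprint_diameter(1.0, 25.0)
print(f"Approximate footprint diameter: {diameter:.2f} m")
```

Under these idealised assumptions the footprint is a little under half a metre across, of the same order as the 46 cm footprint quoted in the text.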

Datasets
A total of four different datasets were derived from the original HSR field data obtained from the ASD handheld spectroradiometer. Upon inspection of the data, significant noise was observed in the 375-400 nm wavelength range, and also in the wavelength range from 900 to 1075 nm. As a result, the original dataset was subset to a range of 400 nm to 900 nm. The first dataset tested used the subset of HSR bands in their original 1 nm increments, resulting in a total of 501 input bands. The second dataset used a variable reduction process to identify redundant bands within the subset 501 band dataset, and was run independently for the three- and four-class configurations. One of the major issues associated with high dimensionality spectral datasets is the occurrence of multicollinearity between bands [21,22]. Due to the high number of input bands, the number of samples required to establish a statistically meaningful result from subsequent image analysis can become exceedingly high [21,23]. This problem is commonly referred to as the Hughes phenomenon, or the 'curse of dimensionality'. In order to reduce issues associated with the Hughes phenomenon and multicollinearity between bands, variable reduction techniques are commonly used. Irisarri et al. [24] found that the most common method used to mitigate such issues within high-spectral-resolution approaches was to first transform the data, and then run a feature selection protocol in order to identify and remove data redundancies. Clevers et al. [23], however, warned against the transformation of data before feature selection, as when data has been transformed, the ability to interpret outputs within a physical and environmental context is lost. In this case, it was decided to omit the data transformation stage, as the ability to relate class reflectance properties to plant biophysical characteristics was highly desirable. Therefore, only a feature selection protocol was applied.
Many authors have proposed methods for reducing data redundancy through feature selection processes [21,23,25,26]. In this case, the dataset was run through the GeneSrF package in R. GeneSrF is a variable reduction protocol that utilises random forests as a means of identifying nonredundant variables with strong predictive capabilities [27]. The package was originally designed for gene selection analysis; however, the method is capable of identifying redundant variables in any dataset for which class predictions based on a large number of numeric variables are required. A key strength of random forest models is that they can provide estimates of variable importance for the resulting model derived from the training data [28]; however, the variables identified as having high importance values are often strongly correlated [29], which must be considered in the final interpretation of results.
The GeneSrF approach works by iteratively excluding a predetermined percentage (typically 20%) of the variables used in the previous iteration, namely those with the lowest importance scores. This process is repeated until all trees are fitted to the dataset [27]. The trees are then examined, and the tree with the smallest number of included variables that still has an out-of-bag error estimate lower than a user-determined threshold between 0 and 1 is identified. In this study, the value was kept at the default of 0.1. The variables used in this tree are then extracted and used in the reduced model [27]. The protocol was run independently over the subset 501 band HSR dataset for both the three-class and four-class versions of the dataset. A total of 106 spectral bands were identified by GeneSrF as belonging to the optimal model for the three-class configuration, and 86 spectral bands for the four-class alternative.
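The iterative elimination described above can be sketched in Python. This is not the GeneSrF implementation (which is an R package); it is a minimal illustration using scikit-learn, run on synthetic data, in which the function name, the choice to recompute importances at every iteration, and the fallback to the lowest-error model when no model meets the threshold are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def backward_eliminate(X, y, drop_frac=0.2, max_oob_error=0.1, n_trees=200, seed=0):
    """Repeatedly drop the least important fraction of variables, then return
    the smallest variable set whose out-of-bag (OOB) error is <= max_oob_error.
    If no set meets the threshold, the lowest-error set is returned instead."""
    keep = np.arange(X.shape[1])
    history = []  # (variable indices, OOB error) for every fitted forest
    while True:
        rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                    random_state=seed)
        rf.fit(X[:, keep], y)
        history.append((keep.copy(), 1.0 - rf.oob_score_))
        if keep.size <= 2:
            break
        n_drop = max(1, int(round(drop_frac * keep.size)))
        order = np.argsort(rf.feature_importances_)  # least important first
        keep = np.sort(keep[order[n_drop:]])
    acceptable = [h for h in history if h[1] <= max_oob_error]
    if not acceptable:
        acceptable = [min(history, key=lambda h: h[1])]
    best_vars, best_err = min(acceptable, key=lambda h: h[0].size)
    return best_vars, best_err, history


# Synthetic stand-in for a band-by-sample matrix (NOT the study's spectra).
X, y = make_classification(n_samples=150, n_features=40, n_informative=5,
                           n_redundant=10, n_classes=3, random_state=0)
selected, oob_error, history = backward_eliminate(X, y)
print(f"{selected.size} variables retained, OOB error = {oob_error:.3f}")
```

Because the selected variables come from a model that ranks correlated predictors similarly, the retained set should be read as one adequate subset rather than a unique optimum.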
In order to determine if high-spectral-resolution input data is required to differentiate between lowland native grassland communities, or whether broadband spectral approaches are sufficient, the subset 400-900 nm narrowband HSR dataset was resampled to match the spectral resolution of the Landsat OLI and WorldView-2 sensor platforms, using a spectral convolution method similar to the spectral resampling workflow in ENVI 5.2 [16,30]. Convoluted spectra were used rather than spectra obtained from satellite imagery to ensure that the effects of spatial resolution and observation area were entirely removed from the analysis. In this manner, differences in classification accuracy as a result of spectral resolution can be isolated. Due to the limited spectral range covered by the ASD Handheld-2 spectroradiometer, not all bands were able to be simulated for all sensors. Sensor bands for which there was no recorded field data were excluded from analysis. For Landsat OLI, only the first 5 bands (430-880 nm) [31] could be emulated, while for WorldView-2, the first 7 bands could be emulated (400-895 nm) [32]. The convoluted Landsat OLI and WorldView-2 spectra were then assigned class labels using both the three- and four-class configurations.
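The resampling step can be illustrated with a short Python sketch. It is a simplification, not the ENVI workflow: it assumes a boxcar (uniform) response between nominal band edges rather than each sensor's measured relative spectral response, and the band edges listed for Landsat OLI Bands 1-3 are nominal values assumed for the example. The function and variable names are illustrative.

```python
import numpy as np

# Nominal Landsat OLI band edges in nm (boxcar approximation, an assumption).
OLI_BANDS_NM = {
    "B1": (430, 450),
    "B2": (450, 510),
    "B3": (530, 590),
    "B4": (640, 670),
    "B5": (850, 880),
}


def convolve_to_bands(wavelengths, reflectance, bands):
    """Average a narrowband spectrum over each broadband interval, which is
    spectral convolution with a uniform response function."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    simulated = {}
    for name, (lower, upper) in bands.items():
        weights = ((wavelengths >= lower) & (wavelengths <= upper)).astype(float)
        simulated[name] = float(np.sum(weights * reflectance) / np.sum(weights))
    return simulated


# 1 nm sampling from 400 nm to 900 nm, as in the subset HSR dataset.
wl = np.arange(400, 901)
# Synthetic linear ramp used only to exercise the function (not field data).
ramp = wl / 1000.0
simulated_oli = convolve_to_bands(wl, ramp, OLI_BANDS_NM)
print(simulated_oli)
```

Replacing the boxcar weights with a sensor's published response curve, interpolated to the 1 nm grid, would follow the same weighted-average structure.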
Figure 3 shows the mean spectral signature for each class in each of the generated datasets. The three-class and four-class reduced HSR dataset values are provided on separate plots as the two datasets have different optimal band selections. The similarity between the observed red and green reflectance (most notably for the Themeda grassland, which has a highly characteristic red appearance) is related to the phenological staging of the communities at the time of data acquisition, and the high amount of dry biomass (as shown in Figure 1).

Classification of Spectra
Classification of the various spectral datasets was undertaken using a random forest (RF) approach similar to that used in [11]. Due to the relatively low number of samples available for each class, and high spectral variability within classes, a k-fold cross-validation approach was employed. In the k-fold validation approach, classification is undertaken multiple (k) times based on different random splits of the reference dataset [33-35]. For this study, a series of 30 random subsets were produced from the original dataset using the scikit-learn module in Python, with a 66% training to 33% validation distribution [36].
Each of the 30 subsets was used to train, classify, and validate an RF model based on each of the datasets using both three- and four-class configurations. This resulted in a total of 8 classification results, drawn across 30 repeat classifications each. For each of the results, the number of variables to try was set equal to the square root of the number of input bands [28,37] and the number of trees was set to 1000. For each of the output results, training and validation accuracies were averaged across the 30 repetitions and reported for each class. Variable importance measures were also derived from the RF training models, and averaged for each result.
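The repeated train/validate procedure can be sketched with scikit-learn as follows. This is an illustrative outline on synthetic data, not the study's code: the use of StratifiedShuffleSplit (rather than an unstratified random split), the reporting of per-class accuracy as per-class recall, and the function name are assumptions. The defaults mirror the settings stated in the text (30 repeats, 1000 trees, square root of the number of bands tried at each split); the demonstration call uses smaller values purely to keep the run short.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import StratifiedShuffleSplit


def repeated_rf(X, y, n_repeats=30, n_trees=1000, seed=0):
    """Train and validate a random forest on repeated random ~66/33 splits.
    Returns the class labels and an (n_repeats, n_classes) accuracy array."""
    splitter = StratifiedShuffleSplit(n_splits=n_repeats, test_size=0.33,
                                      random_state=seed)
    labels = np.unique(y)
    per_class = []
    for train_idx, test_idx in splitter.split(X, y):
        rf = RandomForestClassifier(n_estimators=n_trees, max_features="sqrt",
                                    random_state=seed)
        rf.fit(X[train_idx], y[train_idx])
        predicted = rf.predict(X[test_idx])
        # Fraction of each reference class that was labelled correctly.
        per_class.append(recall_score(y[test_idx], predicted,
                                      labels=labels, average=None))
    return labels, np.vstack(per_class)


# Synthetic three-class data standing in for labelled spectra (NOT field data).
X, y = make_classification(n_samples=180, n_features=20, n_informative=6,
                           n_classes=3, random_state=0)
labels, accuracies = repeated_rf(X, y, n_repeats=5, n_trees=100)
for label, mean, std in zip(labels, accuracies.mean(axis=0), accuracies.std(axis=0)):
    print(f"class {label}: {mean:.3f} +/- {std:.3f}")
```

Keeping the per-repeat accuracies, rather than only their means, preserves the values needed for later significance testing across datasets.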

One of the most important aims of this study was to determine whether there are significant differences in classification accuracy for the analysed lowland native grassland communities based on the input spectral resolution of the dataset. Therefore, to determine whether such effects exist, a series of analysis of variance (ANOVA) tests were undertaken. Differences in mean classification accuracy were tested within classes across the range of datasets, and additionally between class configurations within each dataset.
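A test of this kind can be sketched with SciPy: a one-way ANOVA across datasets for a single class, followed by Tukey's post-hoc comparisons when the ANOVA is significant. The accuracy values below are synthetic placeholders generated from assumed means and are not results from this study; the dataset names are used only as labels. The sketch assumes a SciPy version that provides `scipy.stats.tukey_hsd`.

```python
import numpy as np
from scipy.stats import f_oneway, tukey_hsd

rng = np.random.default_rng(42)

# SYNTHETIC per-repeat accuracies for one class (30 repeats per dataset).
# The means below are arbitrary assumptions chosen to illustrate the test.
names = ["full_hsr", "reduced_hsr", "landsat_oli", "worldview2"]
assumed_means = [0.90, 0.90, 0.80, 0.80]
groups = [rng.normal(mean, 0.02, size=30) for mean in assumed_means]

# One-way ANOVA: does mean accuracy differ between any of the datasets?
f_stat, p_value = f_oneway(*groups)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.3g}")

# Tukey's HSD identifies which pairs of datasets differ.
posthoc = tukey_hsd(*groups)
if p_value <= 0.05:
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            print(f"{names[i]} vs {names[j]}: p = {posthoc.pvalue[i, j]:.4f}")
```

The same pattern would be repeated for each class and for the overall accuracy, with one group of 30 values per dataset.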

RF Training Accuracies
Average training accuracies acquired for all three-class results are summarised in Table 2. Percentages are obtained by averaging per-class results across all 30 trials. There is a high degree of similarity in class results across the different datasets. Mean accuracies for the merged C3 class in particular are extremely close across datasets. Standard deviation values are small for all results, at approximately ±1%. All class accuracies are high, with the lowest accuracy being 85.2% for the C3 class in the full HSR, reduced HSR, and Landsat OLI results, and for the Themeda class in the Landsat OLI result.
Table 3 reports the average RF accuracies obtained from the four-class trials. Accuracies for these are more variable than for the three-class results. The full HSR model obtains the highest accuracies for all vegetation classes, and additionally has the lowest standard deviation. Standard deviations have increased over the three-class results, most noticeably for the two classes composing the C3 class (Wilsonia and Danthonia). The Danthonia class has poor results in all datasets, while other classes have comparable or only slightly decreased accuracies from the three-class results.

RF Variable Importance Measures
[...] 770 nm and 890 nm portions of the spectrum. The two broadband results show contrasting selections again, with the Landsat OLI results showing high importance in Band 4 (640-670 nm) for all vegetation classes, before a decrease in importance for Band 5 (850-880 nm). The WorldView-2 results show a peak of importance in Band 5 (630-690 nm), and Bands 1 (400-450 nm) and 7 (770-895 nm) were also identified as important.
For the four-class tests, highly important variables for all results are very similar to those selected in the three-class trials. Key spectral regions are again associated with similar wavelengths, such as at 400 nm, 550 nm, and 675 nm in the full HSR results. The reduced HSR results again show increased importance in the near-infrared region (800 nm).

Final RF Classification Accuracies
Table 4 shows the final mean classification accuracies for all three-class results. Resulting accuracies are good for all classes, with consistent values obtained between datasets. Full confusion matrices are provided in Appendix A. Classification results for all four-class tests are reported in Table 5. Results for the saltpan class are similar to those obtained in the three-class results; however, the Danthonia and Wilsonia classes have reduced accuracies compared with the three-class equivalents.

ANOVA Results
The ANOVA test undertaken to determine whether class-specific classification accuracies vary with the dataset indicated that, for the three-class results, there was no significant difference in classification accuracy for any class. This means that there is no significant reduction or improvement in classification performance between datasets based on spectral resolution in the three-class configuration. For the four-class tests, however, the Themeda, Danthonia, and overall classification accuracies did vary significantly with the input dataset, at p ≤ 0.05. The results of Tukey's post-hoc comparisons indicated that, for the Themeda class, the reduced HSR result had a statistically significant increase in classification accuracy, with an average increase of approximately 1.5% to 3%. For the Danthonia class and the overall classification accuracy, the Landsat OLI and WorldView-2 accuracies were significantly lower than the accuracies obtained in the full HSR results, by an average of 2% to 3%. No other significant differences were detected between datasets, indicating that the performance of the Danthonia class was not improved by the variable reduction protocol.
The second set of significance tests was used to determine whether differences between class means could be detected within the results of each dataset. The results indicated that the Landsat OLI, WorldView-2, and full HSR datasets exhibited statistically significantly higher classification accuracy for the combined C3 class than for either the Danthonia or Wilsonia classes used in the four-class tests. The mean increase in accuracy for the C3 results over the Wilsonia results across the four datasets is 5% to 7%, and the mean accuracy difference between the C3 class and the Danthonia class is 25% to 28%. Additionally, the accuracy of the Danthonia class was significantly poorer than the accuracy of the Wilsonia class in all of the above datasets (with a mean difference of between 22% and 24%). For the reduced HSR tests, the Themeda class from the four-class model had statistically significantly higher accuracy than the three-class equivalent, with a mean difference of 4.5%. Overall accuracy across all classes, however, was higher for the three-class results than for the four-class results, with a mean increase of approximately 7%.

Training and Classification Accuracies
Mean training and classification results obtained for all combinations of classes and datasets show high degrees of similarity. Class-specific accuracies and standard deviations typically vary from the training accuracy achieved within a dataset by approximately 1%. The observed similarities in accuracy indicate that the sampling protocol is robust, and that full class variation has been accounted for in both the training and validation stages of classification.
Final classification accuracies achieved for the range of three-class tests are good, with classes exhibiting consistent behaviour across the four datasets. Accuracies are very similar within classes for the various datasets, with standard deviations also showing highly consistent values. Confusion rates are similarly consistent across the range of tests. The primary source of misclassification is confusion between the C3 and Themeda classes. The rate of misclassification is slightly higher for Themeda points being misclassified as C3 than for C3 being misclassified as Themeda. Confusion within the saltpan class is negligible, with very few points in either of the other two classes being wrongly attributed to this class.
The results for the four-class tests also exhibit consistent levels of accuracy across the range of datasets; there is slightly more variation observable in the three vegetation classes, but such differences in accuracy are only in the order of ~3%. Confusion between classes is similar for both the three- and four-class results. Classification accuracy for the Danthonia class was poorer than for the other classes, ranging from 57% for the Landsat OLI tests to 59.8% for the full HSR tests. These poor results appear to be due to consistent confusion with the Themeda class. For each of the four datasets, approximately 25% of all Danthonia validation and training spectra were consistently identified as Themeda across the 30 classifications. Confusion with the Wilsonia class is also a significant contributor to the poor accuracy of the Danthonia class, with approximately 12% of points being misclassified in this manner. The higher rate of confusion between Danthonia and Themeda is likely due to similarities in canopy structure between the two classes, as both exhibit erectophile morphology that is not present in the Wilsonia class. The primary physiological similarity between the Wilsonia and Danthonia classes is the shared photosynthetic pathway, which is expressed spectrally as higher levels of greenness than observed in the C4 Themeda class due to differences in phenological staging. Confusion for the Wilsonia class is almost exclusively with the Danthonia class, with very few observations being misidentified as Themeda. Confusion in the Themeda class is consistently with the Danthonia class, again likely as a result of similar canopy structure.
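The confusion rates quoted above are proportions of each reference class assigned to each predicted class. A minimal sketch of that calculation follows. The counts are hypothetical and were chosen only to echo the approximate Danthonia percentages given in the text; they are not the matrices from Appendix A. The row and column orientation (rows as reference class, columns as predicted class) is also an assumption.

```python
def row_rates(confusion, labels):
    """Convert a count matrix into per-reference-class proportions.

    Rows are assumed to be the reference class and columns the
    predicted class, in the order given by `labels`.
    """
    rates = {}
    for i, ref in enumerate(labels):
        total = sum(confusion[i])
        rates[ref] = {
            pred: confusion[i][j] / total for j, pred in enumerate(labels)
        }
    return rates


labels = ["Themeda", "Danthonia", "Wilsonia", "Saltpan"]

# Hypothetical counts for illustration only; each row sums to 100
confusion = [
    [88, 10, 2, 0],   # reference: Themeda
    [25, 60, 12, 3],  # reference: Danthonia
    [1, 17, 82, 0],   # reference: Wilsonia
    [0, 0, 1, 99],    # reference: Saltpan
]

rates = row_rates(confusion, labels)
print(f"Danthonia correct: {rates['Danthonia']['Danthonia']:.0%}")
print(f"Danthonia as Themeda: {rates['Danthonia']['Themeda']:.0%}")
print(f"Danthonia as Wilsonia: {rates['Danthonia']['Wilsonia']:.0%}")
```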

Variable Importance Measures
As shown in Figure 4, for the three-class tests, prioritisation of spectral regions is similar between the four datasets. Key wavelengths identified are at 400 nm, 550 nm, 675 nm, and 900 nm. For the two HSR tests, the reduced dataset had higher importance levels than the full HSR dataset, with wavelengths 675 nm, 780 nm, and 890 nm being identified as highly important. The full HSR model shows lower importance values for longer wavelengths than those observed in the other three-class results. The reduced model has high numbers of variables selected in the regions surrounding 400 nm and 675 nm, indicating the presence of uncorrelated information contained within these bands. The smaller number of variables selected by the reduction process in the remaining key regions is, however, more important overall to the classification models. In the broadband Landsat OLI and WorldView-2 results, the most important bands cover similar spectral regions to those identified in the HSR results. For the Landsat OLI result, the most important band is Band 4 (640-670 nm), followed by Band 5 (850-880 nm). For the WorldView-2 result, the most important band was Band 5 (630-690 nm), followed by Band 7 (770-895 nm) and Band 1 (400-450 nm). The Landsat OLI result is the only result that does not indicate localised high importance levels in the shorter wavelengths, as the sensor does not cover the 400-450 nm region, while all other sensor configurations do.
Within the three-class results, class-specific variable importance measures indicate the presence of key regions of separability. The Themeda class exhibits much higher importance compared to the C3 class in all regions. This difference is most evident in the short-wavelength 400-450 nm region, in which the C3 class exhibits only a slight increase in importance while the Themeda class shows a clear increase in the reduced HSR results; the Themeda class also has clearly higher importance levels at 550 nm and 680 nm. These spectral regions are strongly associated with plant pigment levels, most notably carotenoids in the 400-450 nm region [38], and anthocyanin at approximately 550 nm [39]. Increased reflectance in regions associated with such pigments is a good indicator of vegetation senescence [40]. Given the varying phenological staging of the two classes as a result of their different photosynthetic pathways, these associations are not unexpected. The reduced HSR results and the Landsat OLI and WorldView-2 models all indicate strong variable importance at 890 nm for all classes. This region is known to be associated with water content [13], which differs between vegetation types as a result of phenological differences and morphology.
The variable importance results obtained for the four-class tests identify the same spectral regions as being of key importance as do the three-class results. Band selections are almost identical for the reduced three- and four-class results, although fewer bands around 680 nm were selected for the four-class results. In all of the datasets, the Themeda class again shows high importance values at 400-450 nm, as well as at 550 nm in the two HSR trials. The Wilsonia class shows high importance in the broadband Landsat OLI and WorldView-2 trials in the same bands as those identified as important for the C3 class; however, the degree of importance is dramatically higher than in the three-class equivalent tests. The Danthonia class shows low importance across the range of spectral regions, although localised peaks can be detected at 775 nm and 890 nm within the reduced HSR result, in Band 1 (400-450 nm) and Band 5 (630-690 nm) for the WorldView-2 result, and in Band 4 (640-670 nm) and Band 5 (850-880 nm) for the Landsat OLI result. The selection of the same spectral regions as identified in the three-class tests indicates that the same plant biophysical properties as discussed previously are likely to be the key drivers of class separability in both the four- and three-class results.

ANOVA and Tukey's Post-hoc Comparisons
As each dataset was classified multiple times through the use of a k-fold cross-validation approach, statistical measures indicating the significance of differences in classification accuracy observed between datasets and classes can be produced. The production of such measures is important to this study, as it allows for determination of the spectral resolutions required to accurately classify lowland native grassland communities, as well as the identification of classes that are difficult to separate based on spectral properties alone.
For the three-class tests, the ANOVA indicated that there were no statistically significant differences in classification accuracies between datasets. This means that the spectral configuration of the dataset neither improves nor reduces mean accuracy for any class. Therefore, it can be concluded that the two vegetation classes, C3 and Themeda, can be differentiated with the same degree of accuracy using both HSR and broadband datasets. The results obtained indicate that HSR datasets may provide little improvement in lowland native grassland community differentiation over broadband multispectral approaches under these conditions of a simplified class configuration. For the four-class results, however, the ANOVA identified several differences in class accuracy across the range of datasets. For the Themeda class, a statistically significant improvement in accuracy within the reduced HSR result was detected, although it is worth noting that the detected increase in accuracy is small (2-3%). Additionally, the Danthonia class was identified as having statistically significantly lower accuracies in the Landsat OLI and WorldView-2 results compared to the reduced HSR results. No difference in accuracy was detected between the full and reduced HSR results in this case. Overall classification accuracy for all classes followed a similar pattern, with poorer performance in the Landsat OLI and WorldView-2 results than in the reduced HSR output. This finding indicates that when the C3 class is split into Wilsonia and Danthonia classes, there is some evidence that HSR datasets can improve classification accuracy. This finding has strong implications regarding the potential to differentiate lowland native grassland complex patches that were previously identified as difficult to discriminate, and indicates that a combination of HSR spectral data and other ancillary data is likely needed.
When differences in class-specific accuracy were analysed within each dataset, the results based on the Landsat OLI, WorldView-2, and full HSR datasets were similar. No statistically significant variation in classification accuracy between the three- and four-class versions of the Themeda and saltpan classes was identified within these datasets, meaning that the class configuration had no significant effect on final accuracy for these classes. The results for these three datasets indicate that the combined C3 class has statistically significant improvements in classification accuracy over the Wilsonia and Danthonia classes, with mean increases of 5% and 22%, respectively. The analysis within the reduced HSR model identified a statistically significant increase in classification accuracy of approximately 4% for the Themeda class in the four-class configuration over the three-class alternative. Additionally, the results indicated that the C3 class again has statistically significantly higher classification accuracy than both the separated Wilsonia and Danthonia classes within this dataset, and that the overall accuracy for the three-class result is significantly higher than the overall four-class accuracy.
Overall, the results of the ANOVA indicate that for the majority of classes, there is no improvement in classification accuracy based on the dataset; however, the application of a simplified three-class model does result in higher classification accuracy. The Themeda class, in contrast to these generalised findings, reaches optimal performance based on a full four-class configuration employed on a reduced HSR dataset. Even though the results indicate that a broadband three-class model is likely to be the best candidate for generalised community differentiation, the inability of such models to accurately discriminate between the Danthonia and Themeda classes is of concern. Optimal model selection should prioritise good performance for lowland native grassland classes over the performance of others.

Sampling Considerations
As the field sampling design used in this study is randomised and based on clustered observations along transects, the number of observations collected for each class is not equal. In the four-class model, a total of 214 Danthonia points and a total of 228 Wilsonia points were identified. When combined in the three-class model, this results in a total of 442 C3 training points. As RF models automatically set aside 33% of all input training points used to grow a tree for cross-validation [28], this reduces the number of potential training points for both classes significantly, to 153 points for the Wilsonia class and 143 points for the Danthonia class. In comparison, when 33% of all C3 training points are withheld, the model is still created using a total of 296 points for the class. The inclusion of fewer points in the training stage of RF classification can result in significantly poorer final outcomes, as the random subsetting process may derive a nonrepresentative sample from the larger input training dataset from which to grow the tree. This may result in trees being unable to accurately classify the data in the final stage of analysis. Therefore, the C3 class has an inherent advantage over the two split classes, potentially resulting in artificially inflated classification accuracies. This issue could be addressed in future studies through the employment of a stratified field sampling protocol, or by randomly selecting a subset of the C3 class with a similar number of input points to the Wilsonia and Danthonia classes to ensure equal sample sizes. Further analysis is needed to determine whether unequal sample sizes have adversely affected the results for the four-class trials.
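The sample-size arithmetic above, together with the suggested random subsetting of the C3 class, can be sketched as follows. The point counts and the 33% withheld share are taken from the text. The helper names `in_bag_count` and `balanced_subset`, the fixed seed, and the use of integer identifiers as stand-ins for spectra are assumptions made for illustration.

```python
import random

WITHHELD_FRACTION = 0.33  # share of points set aside per tree, as stated in the text


def in_bag_count(n_points, withheld=WITHHELD_FRACTION):
    """Number of points left to grow a tree after withholding a share."""
    return round(n_points * (1 - withheld))


def balanced_subset(points, target_size, seed=0):
    """Randomly draw `target_size` points without replacement."""
    rng = random.Random(seed)
    return rng.sample(points, target_size)


counts = {"Wilsonia": 228, "Danthonia": 214}
counts["C3"] = counts["Wilsonia"] + counts["Danthonia"]

in_bag = {cls: in_bag_count(n) for cls, n in counts.items()}
for cls, n in in_bag.items():
    print(f"{cls}: {counts[cls]} points, {n} retained per tree")

# Stand-in identifiers for the C3 spectra, reduced to the Wilsonia sample size
c3_ids = list(range(counts["C3"]))
c3_subset = balanced_subset(c3_ids, counts["Wilsonia"])
print(f"Balanced C3 subset size: {len(c3_subset)}")
```

Repeating the three-class classification on such a subset would give one way to check whether the higher C3 accuracy persists when sample sizes are equalised.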

Spatial Resolutions
The analysis in this study is focused solely on determining the spectral resolution and class configurations required for accurate community classification, and as a result does not consider spatial resolution and its effects on classification results. As the ASD Handheld-2 spectroradiometer has a 25° field of view, based on a one-metre sampling height, spectra are collected across a circular area with a diameter of 46 cm. Care was additionally taken to ensure that spectra covered patches consisting of only a single class. Even though the results obtained in this study, when taken in conjunction with the results obtained by [11], indicate that broadband approaches to lowland native grassland differentiation are feasible, the comparatively coarse spatial resolution of the Landsat OLI and WorldView-2 sensors may have a significant effect on final accuracy. The mixture of multiple thematic classes within a single pixel will result in significantly higher deviation in class spectral signatures and additional increases in class generalisation. The results for the simulated broadband sensor tests in this paper assume spectral signatures to be representative of only a single class. The results obtained in [11] indicate that classes can be accurately identified using such broadband approaches; however, the classification accuracies that were obtained are much lower than the results presented in this paper. The deviation in accuracy observed for classes between the two studies undertaken thus far is likely to be the result of varying spatial resolutions and the sampling methods used to generate the training and validation datasets. The use of field-based methods for data collection in this case is likely to be more accurate than the method used in [11]. Each sample collected in the field was individually visited multiple times, and the land cover class confirmed. This contrasts with the approach used in [11], in which polygon extents were generated based on extrapolation of field observations. The use of a more spatially precise sampling regime is likely to have contributed to the superior accuracies achieved in this study compared to those obtained in previous works.
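For reference, the ground footprint of a nadir-pointing sensor can be approximated from the field of view and sampling height using simple cone geometry. The sketch below assumes an idealised cone, a flat surface, and a height of exactly one metre; the function name `footprint_diameter` is hypothetical. Under these assumptions the diameter is roughly 0.44 m, in the same range as the 46 cm quoted above; the remaining difference presumably reflects details of the actual setup that the text does not specify.

```python
import math


def footprint_diameter(height_m, fov_deg):
    """Diameter (m) of the circular footprint for a nadir view.

    Assumes an ideal cone with full opening angle `fov_deg` viewed
    from `height_m` above a flat surface.
    """
    half_angle = math.radians(fov_deg) / 2.0
    return 2.0 * height_m * math.tan(half_angle)


diameter = footprint_diameter(height_m=1.0, fov_deg=25.0)
print(f"Footprint diameter: {diameter * 100:.0f} cm")
```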
Further studies may benefit from the incorporation of multitemporal datasets to include information relating to phenological differences between communities. The Sentinel-2 constellation may also provide improvements in classification results due to its higher spectral resolution, finer spatial resolution compared to Landsat OLI, and frequent revisit time. This combination of attributes has potential to produce highly accurate vegetation maps, particularly given the highly fragmented and diverse nature of lowland native grassland communities.

Conclusions
The results of this study provide several findings that can be used to improve future lowland native grassland mapping approaches. Firstly, the results from the series of three-class tests indicate that for generalised class configurations, broadband spectral resolutions are capable of providing high classification accuracies for lowland grassland communities. The results of the ANOVA indicate that there is no statistically significant improvement in classification accuracy when analysis is undertaken using HSR datasets for the three-class configuration. The second finding is that the separation of Danthonia and Themeda grasslands is not feasible based solely on spectral properties. All of the spectral resolutions trialled in this study failed to provide accurate differentiation between these two classes, indicating that future classification approaches must consider alternative variable sources to provide good results. The inclusion of variables related to habitats, nutrient status, phenology, or structure may provide some improvement on the results obtained here. The improvement in classification accuracy for the merged C3 class to 85% mean accuracy, from 80% mean accuracy for the Wilsonia class and 57% mean accuracy for the Danthonia class, was found to be statistically significant with p < 0.05. The third finding of the study is that Themeda triandra communities have statistically significantly higher classification accuracy in the reduced HSR results, in which the mean classification accuracy was 89%, compared to the broadband results, in which the average accuracy was 86%. Additionally, the Themeda class benefits from the separation of the Wilsonia and Danthonia classes, with a statistically significant improvement in classification accuracy from 84% in the three-class results to 87% in the four-class results. This result indicates that the use of HSR datasets is potentially valuable in this case, as even though the accuracy of other classes is not improved over broadband approaches, there is no indication that using HSR datasets results in poorer classification outcomes. The final finding of this study relates to variable importance measures and their relationships to plant biophysical properties. The importance measures obtained across both sets of classes and all datasets indicate key regions of separability related to pigment levels [38] and water content [13]. These associations provide valuable insight into the communities, and can aid in the selection of appropriate spectral ranges for future studies and in the selection of optimal data collection times.
Overall, the results of this study indicate that classification of lowland native grassland communities using HSR datasets is possible. The findings of this work provide valuable insight and information that can be used to improve future mapping approaches. The results additionally corroborate the findings of previous works in this field, indicating that there are significant issues associated with classification of C3 grass species, particularly those in the lowland native grassland complex community, and that the incorporation of nonspectral variables is likely to be important for ensuring accurate results.

Figure 2. Location of field plots showing transect locations and observation intervals along each transect. The associated vegetation class for each observation is shown as per the legend. Each point represents the location of three nadir and off-nadir spectral signatures.

Figure 3. Mean spectral signatures for all classes in each dataset. Sub-figures (A) to (E) show the behaviour of the land cover classes in the various datasets as spectral resolution changes. Landsat OLI and WorldView-2 mean spectra are plotted against the centre wavelength of each respective band.

Figure 4
Figure 4 shows the mean variable importance measures for both the three- and four-class training results. The full and reduced HSR values, shown in the top row of the figure, show patterns of selection for variables in the 400 nm, 550 nm, and 675 nm portions of the spectrum. The reduced HSR results, in contrast to the full HSR results, exhibit increased importance for the bands retained in the

Figure 4. Subfigures (A) to (H) show the mean RF variable importance for all class configurations across the test datasets. Importance values are reported for each band as an average value across the 30 RF models produced. Higher plot values mean that for the given class, the variable is more important in the classification.

Table 1. Number of training and validation points used per class. Total point count is given in the final column. The C3 class count is the sum of the Danthonia and Wilsonia counts.

Table 2. Average random forest (RF) training accuracy for all three-class results. Accuracies are given as a percentage obtained by averaging the results of all 30 RF training models. Standard deviations are given as a percentage value above or below the mean. HSR denotes 'High Spectral Resolution'.

Table 3. Average RF training accuracy and standard deviations for all four-class results, as averaged across the 30 RF models. Accuracies are presented as mean percentages, while standard deviation is presented as a percentage range above or below the mean.

Table 4. Average RF classification accuracy and standard deviations for all three-class results, as averaged across the 30 RF classifications. Accuracies are presented as mean percentages, while standard deviation is presented as a percentage range above or below the mean.

Table 5. Average RF classification accuracy and standard deviations for all four-class results, as averaged across the 30 RF classifications. Accuracies are presented as mean percentages, while standard deviation is presented as a percentage range above or below the mean.

Table A4. Mean confusion matrix for WorldView-2 three-class classification results. Confusion values are given as a count value, and overall accuracies as a percentage with standard deviation.

Table A5. Mean confusion matrix for full HSR four-class classification results. Confusion values are given as a count value, and overall accuracies as a percentage with standard deviation.

Table A6. Mean confusion matrix for reduced HSR four-class classification results. Confusion values are given as a count value, and overall accuracies as a percentage with standard deviation.

Table A7. Mean confusion matrix for Landsat OLI four-class classification results. Confusion values are given as a count value, and overall accuracies as a percentage with standard deviation.

Table A8. Mean confusion matrix for WorldView-2 four-class classification results. Confusion values are given as a count value, and overall accuracies as a percentage with standard deviation.