Categorizing Wetland Vegetation by Airborne Laser Scanning on Lake Balaton and Kis-Balaton, Hungary

Outlining patches dominated by different plants in wetland vegetation provides information on species succession, microhabitat patterns, wetland health and ecosystem services. Aerial photogrammetry and hyperspectral imaging are the usual data acquisition methods, but airborne laser scanning (ALS) as a standalone tool also holds promise for this field since it can be used to quantify three-dimensional vegetation structure. Lake Balaton is a large shallow lake in western Hungary with shore wetlands that have been in decline since the 1970s. In August 2010, an ALS survey of the shores of Lake Balaton was completed with 1 pt/m² discrete echo recording. The resulting ALS dataset was processed into several output rasters describing vegetation and terrain properties, creating a sufficient number of independent variables for each raster cell to allow for basic multivariate classification. An expert-generated decision tree algorithm was applied to outline wetland areas, and within these, patches dominated by Typha sp., Carex sp. and Phragmites australis. Reed health was mapped into four categories: healthy, stressed, ruderal and die-back. The output map was tested against a set of 775 geo-tagged ground photographs and had a user’s accuracy of > 97% for detecting non-wetland features (trees, artificial surfaces and low density Scirpus stands), > 72% for dominant genus detection and > 80% for most reed health categories (with 62% for one category). Overall classification accuracy was 82.5% and Cohen’s Kappa was 0.80, which is similar to some hyperspectral or multispectral-ALS fusion studies. Compared to hyperspectral imaging, the processing chain of ALS can be automated in a similar way but relies directly on differences in vegetation structure and actively sensed reflectance and is thus probably more robust. The data acquisition parameters are similar to the national surveys of several European countries, suggesting that these existing datasets could be used for vegetation mapping and monitoring.


Introduction
Shore wetland vegetation plays an important role in the functioning of lake systems. The ecotone between land and water creates a large variety of microhabitats and the high biomass production of wetland vegetation feeds energy into the food web [1]. Wetlands are often the least disturbed areas of a lake and thus can act as a refuge for wildlife in seasons when the pressure of human presence on the ecosystem is intensive. Many pelagic or shore species depend on wetlands in general or the presence of a specific type of wetland vegetation for overwintering, reproduction and feeding [2]. Some functions of shore wetlands, such as erosion and flood protection and pollution demobilization, are also important from an economic point of view [3]. Pressure on shore wetlands is increasing in Europe, since shore areas are intensively used for recreation and industry, and global climate change and local pollution also affect these sensitive communities [4].
The vegetation typical for the studied wetlands consists of emergent macrophytes, which are large grass-like vascular plants with perennial underground stems and roots that can grow in the submerged sediment. Stalks and leaves are active during the growth season, extending up to several meters above the water surface and dying back in winter [5]. These plants can form both monodominant and mixed stands, which can sometimes be separated by well-defined borders. The pattern of patches is not constant in time [6], but our understanding of temporal changes is limited due to the restricted availability of species or genus level monitoring (a genus is a category directly above the level of species).

The Conservation Status of Shore Wetlands
The need for species and genus information is enhanced by the fact that many reed-dominated wetlands are in decline in Europe [7], which is probably mainly due to oxygen depletion of the root zone [8–10]. These areas show signs of stress at the level of the whole ecosystem, including stunted growth, restricted reproduction, encroachment of terrestrial species and the formation of clumps on the boundary to open water [11,12]. During this process, the patches previously occupied by reed can be colonized by other species. Monitoring vegetation health and species composition is necessary to understand reed decline and initiate restoration [13,14]. While wetland vegetation health can be understood in a change detection context (involving mapping by remote sensing time series), quantitative surveying of the distinctive symptoms of reed stress and die-back [15] allows reed health to be mapped by creating only a snapshot in time. This may provide a less deep understanding of the process but delivers up-to-date information for decision support. Beyond the scope of a single species, a healthy wetland ecosystem is usually a mosaic of patches of different species, where the presence of the natural zonation is a criterion of a functioning wetland habitat [16]. Wherever the mapping of different wetland plant genera is possible, the presence of zonation and mosaic structure can also be assessed and used as a proxy of wetland vegetation health [17,18].

Objective
The objective of this study was to develop a methodology for mapping wetland vegetation composition at genus level, and for recognizing the presence as well as the health or stress of the dominant species, based only on ALS-derived features calibrated to ground truth data. This includes testing the accuracy of the proposed method on a survey of a major wetland system against independently collected reference data and comparing it to other surveying methods applied to similar targets. Mapping vegetation height and structural parameters is often the focus of ALS-based vegetation surveys, but was not an objective of the current study.

Passive Remote Sensing of Wetland Vegetation
Vegetation monitoring typically relies on collecting field data on the presence, abundance and health of plant species [19]. However, in wetlands, difficult access limits the collection of classical botanical data, and airborne [20–24] or satellite imaging [25–27] is widely used for monitoring [28]. In addition to classical aerial photography, multispectral and hyperspectral methods for vegetation classification and mapping have also proved successful in many surveys [13,26,27,29–32]. Nevertheless, passive optical imaging of wetland vegetation has its limitations: the pixels of high spectral resolution images are typically larger than the ALS footprint sizes, and this causes aggregation of the spectral information encountered horizontally, which is difficult to resolve during classification [26]. The potential of spectrally based classification to identify different types of vegetation is always controlled by the de facto differences in their reflectance spectra, which can be limited in some cases [32].

Airborne Laser Scanning as a Method for Vegetation Surveys
Airborne Laser Scanning (ALS) samples the Earth's surface by measuring the signal travel time of laser pulses between the terrain and the airborne platform. The travel time is directly proportional to the distance, which can therefore be computed [33]. A dense set of points in a three-dimensional coordinate system is created from these distances and the position and orientation of the sensor platform. The latter are typically tracked continuously by a synchronized global navigational satellite system (GNSS) and an inertial measurement unit (IMU). ALS is traditionally used for mapping terrain topography [33], exploiting its ability to penetrate vegetation but removing the echoes corresponding to the canopy as they are not informative for terrain modeling [34–36]. Since the ALS points corresponding to the echoes from the canopy provide a strong representation of the vertical and horizontal structure of the vegetation due to the high sampling density and accuracy, ALS holds potential for vegetation mapping. ALS mapping of vegetation structure is operational in forests [37–39] and rapidly developing in shrublands [40–44], but applications to riparian vegetation types remain rare [45,46].
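The range computation mentioned above can be illustrated with a minimal sketch. It assumes that the measured travel time covers the path to the target and back, and that the pulse travels at the vacuum speed of light (the slight slowdown in air is ignored). The function name and the example travel time are illustrative and do not come from the survey.

```python
# Speed of light in vacuum (m/s); atmospheric refraction is ignored in this sketch.
SPEED_OF_LIGHT = 299_792_458.0


def range_from_travel_time(two_way_time_s: float) -> float:
    """Return the sensor-to-target distance (m) for a two-way travel time (s)."""
    # The pulse covers the distance twice (out and back), hence the division by 2.
    return SPEED_OF_LIGHT * two_way_time_s / 2.0


# A pulse that returns after about 8 microseconds has travelled roughly 1.2 km each way.
example_range_m = range_from_travel_time(8.0e-6)
print(f"{example_range_m:.1f} m")
```

Combining such ranges with the scan angle and the GNSS/IMU trajectory is what turns individual echoes into georeferenced points; that step is not shown here.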

Wetland Vegetation Mapping Based on ALS as a Standalone Tool
ALS has sometimes been used as a standalone tool to assess the hydrological roughness of floodplain vegetation for hydrological modelling but this has usually not been extended to a level of detail suitable for species or wetland condition monitoring [47,48]. Object-based analysis of derivatives of the point cloud has been successfully used to outline riparian vegetation and streambed extent [49]. Multiple wavelength or multi-temporal surveys have proved to contain sufficient information for vegetation classification beyond the level of growth forms: Spartina stands and the sediment accumulation they facilitate in a saltmarsh environment were successfully outlined using two consecutive ALS surveys [50]. Dual wavelength ALS has been applied by Collin [51] for mapping saltmarsh vegetation (17 categories) and very high accuracy (92%) was reached by multivariate classification of rasterized spectral and spatial ALS products as pseudo-bands.

Wetland Vegetation Mapping Based on the Fusion of ALS-Derived Data with Other Data
Due to its potential to sample ground elevation even below a canopy [35], the first applications of ALS in wetland ecology were for creating a very detailed digital terrain model (DTM), which was then applied as a background variable map to explain the spatial patterns of different vegetation types classified from airborne true color or infrared orthophotos. Knight [52] used an ALS-derived DTM to calculate tidal inundation patterns in mangrove wetlands and identify potential mosquito habitats. Jenkins [53] also used ALS data for outlining upland swamps and identified vegetation categories within these boundaries on the basis of multispectral satellite data. Morris [54] combined ALS data with multispectral images to map intertidal habitats and evaluate the link between different vegetation categories and elevations above tide level. Gilvear [55] used an ALS-derived digital surface model (DSM) as a background dataset to support visual interpretation of hyperspectral data and found that the introduction of vertical structural information in addition to spectral properties increased the accuracy of visual interpretation.
While in the previously listed studies ALS was used as a background dataset or an introductory step of vegetation classification to delineate areas in focus, rasterized ALS data can also be fused with imaging spectrometer images on a pixel basis. This creates a raster dataset with several channels, some of which are actual spectral bands from the imaging sensor, while others are derived from ALS but accessible within the same dataset (pseudo-bands) [56]. Several such studies demonstrate a significant increase in wetland classification accuracy compared to using only spectral or ALS data [57,58]. In riparian areas of Australian savannahs, the fusion of QuickBird images with several ALS-derived data layers allowed the identification of riparian vegetation by object-based classification [59]. Hence, according to recent research papers, the combination of ALS and optical imagery gives the best results in individual species detection.

Enhancing the Information Contained in ALS Point Datasets
Single-band laser scanning data (e.g., 1,064 nm) alone are apparently considered to contain insufficient information to be reliably used for classifying wetland vegetation to health categories and genera. This is because the traditional approach to vegetation mapping from remotely sensed data involves classification on the basis of spectral differences [56]. However, several methodological studies show that the information content of ALS data can be enhanced after collection by various processes including intensity correction [60], radiometric calibration [61,62], and dropout modeling [63].
Most commercially available ALS systems record information on the amount of backscattered energy, i.e., the amplitude (often referred to as intensity) and, in case of full-waveform recording instruments, also the echo width. This parameter seems to hold significant information for species determination [64,65]. The amplitude of the returning laser pulse depends on the reflectance of the sampled surface, but is modulated by many other factors including (but not limited to) the energy of the laser pulses, the atmospheric transmittance, the angle between the beam and the local surface normal (i.e., the angle of incidence) and the distance between the ALS system and the target. Maps of echo amplitude can be used for visual interpretation of terrain features [65], but the quantitative application of echo amplitude is only possible if it can be calibrated to actually represent the optical properties of the surface [61]. Several solutions to this problem have been proposed, ranging from applying a smoothing filter [66] to modeling signal path and local surface normals [62] and combining this with surface reflectivity measured with an active instrument [61]. If the radiometric calibration of the data points is reliable enough, single-channel classification methods can be applied to the dataset.
The presence of open water is important for wetland vegetation mapping but it is difficult to map through single-band ALS. A calm water surface is an almost perfect specular reflector and thus only reflects a high amount of radiation back into the sensor when observed at approximately nadir (i.e., the laser beam and the ray of the reflected echo coincide) [63]. At other observation geometries the amount of radiation reflected towards the sensor is often too low to be detected by the receiving unit. This is compounded by the fact that the reflectance of water in the near infrared wavelengths often used for ALS is generally low, so echoes from a water surface are often not recorded at all. Brzank et al. [67] propose a point-based fuzzy classification procedure that calculates the membership weights of the class "water" for each ALS point on the basis of the recorded amplitude, elevation and point density. Höfle et al. [63] demonstrate a method based on the combination of radiometrically corrected intensities and relative positions of ALS points that allows high accuracy identification and outlining of open water areas. This latter solution involves the reconstruction of the missing ALS points (dropouts) that were not recorded due to specular reflection on the basis of the GPS time tag of the points, the scanning rate and the scan pattern. The point cloud is segmented on the basis of the local surface roughness, the density of points with intensity values below a threshold, and intensity variation.

Study Area
Lake Balaton is a large (594 km²) shallow (mean depth 3.3 m) lake in western Hungary, with more than half of the shoreline sustaining reed-dominated wetlands [68] (Figure 1). These are protected under Natura 2000 and the Ramsar Convention. Kis-Balaton is a reconstructed wetland of 70 km² slightly upstream of Lake Balaton, with an average water depth of 1.2 m. Changes in the spatial extent of these wetlands were identified in a long-term archive aerial image study [10,69] and the loss of wetland vegetation area since the 1970s has been documented. However, hardly any spatially explicit data exist on the extent of different vegetation types within the wetlands or on the extent of reed stress that forecasts future losses of area. The last field-based botanical survey including non-reed wetland vegetation was carried out in 1987 on Lake Balaton [70] and 1982 on Kis-Balaton [71]; after this, wetland vegetation monitoring has focused on the suitability of reed for industrial use [68].

Categories Used for Vegetation Classification
The categories of the classification were vegetation types selected before the flight based on field experience and knowledge from archive aerial photo interpretation [10,69]. The aim was to produce an ecologically relevant map with categories that can be recognized in the field, as well as potentially including all possible land cover types that were present in the littoral zone:
Wetland: For the purpose of this study, wetlands are defined in a vegetation ecology sense, contrary to other definitions that might be used for geomorphology, hydrology, pedology, etc. Wetlands are areas where the water or groundwater surface is regularly near or above the sediment surface or the soil is regularly fully saturated with water and where the vegetation is mainly composed of emergent macrophytes [5].
Trees/shrubs (abbreviated as trees in the following): Although trees and shrubs can be present in wetlands, separating them into different classes is beyond the scope of this paper, so this category is simply defined as areas where the dominant plants have a branching woody stem. Typically, these are Populus, Salix or Alnus trees, but other species are also present. These plants are expected to be higher than wetland vegetation, and are usually found on slightly elevated patches of dry land on the shore, within or near wetlands. Trees typically produce multiple echoes of the ALS pulse, so both the top of the canopy and the terrain surface can be identified on a vertical profile of the point cloud (Figure 2).
Scirpus/Schoenoplectus/Typha angustifolia (abbreviated in the following as Scirpus): Although these are also emergent shore macrophytes, their stand structure differs from wetland macrophytes, because they grow at a much lower stem density and although the tips of the leaves emerge from the water, most of their length is submerged. Such areas are typically found on the most exposed edges of wetlands because of their ability to tolerate wave energy.
Water/artificial: This class contains water surfaces and man-made structures. For the purpose of mapping wetland vegetation, water surfaces do not have to be separated from artificially cultivated or covered areas (grazed or mown grasslands, agricultural areas, asphalt and concrete surfaces), bare soil or otherwise unvegetated structures. Water typically has a very low reflectance in the near infrared spectral band where the instrument operates [63,72], similar to tarmac surfaces and railway embankments, while very high reflectance is produced by the dense closed canopies of cultivated fields, the flat surface of mowed grasslands, and gravel or concrete surfaces (Figure 2). Depending on the roughness of the water surface, the observed reflectances are very low (or the points can be completely missing), or extremely high wherever the local water surface is perpendicular to the incoming pulse (on wave slopes and/or at sensor nadir).
The wetland class as defined above is further divided into vegetation classes containing reed stress categories and genera of other species. Classification to species level was not attempted because identification of wetland macrophyte species is usually based on properties of flower structure and leaf veins, and is thus difficult and often uncertain even when done in the field. Different genera mostly represent different forms of growth and are thus relevant categories for habitat identification.
Typha: Typha plants are characterized by their narrow and long leaves, which all grow from the base of the stem, and can reach up to 250 cm of height. The leaves are relatively thick and rigid, so they are usually near-vertical for most of their length. This means that the penetration of the ALS pulse is high in these areas, often reaching the ground or water surface, where most of the pulse energy is lost to low reflectance of water and bare soil or specular reflection from water. This and the dark green color of the leaves mean that Typha areas are usually characterized by low ALS reflectance (Figure 2). Typha is very tolerant to anoxic sediment but is sensitive to wave action; therefore it is mostly found in the central areas of wetlands, in water depths between 50 and 100 cm, surrounded on all sides by Carex or reed. It can also form monodominant stands on the open water boundary in sheltered areas or mixed stands with other species.
Carex: Carex plants also have leaves that all sprout from the bottom of the stem but these are less rigid and usually have a curved shape as they bend towards the ground. The canopy height can also reach 200 cm, but this is rare: 50-130 cm is typical. These leaves interlock to form a dense closed canopy which restricts ALS signal penetration and reflects most of the pulse energy (Figure 2). Carex stands are characteristic on the shore side of wetlands, in periodically dry areas with water shallower than 50 cm year round.
Reed: Phragmites australis is the dominant wetland macrophyte of the study area, and is the only Phragmites species known in the area [68,73]. The canopy of Phragmites consists of leaves growing in regular intervals along the stem, which can reach a height of up to 4 meters above the water level [74]. This means that signal penetration is initially high but the signal rarely reaches the water surface as it gets reflected from the subsequent layers of leaves (Figure 2). The echo amplitude is usually high as the canopy is dense but in some cases (especially on the SW shore of the lake) canopy density can be low enough to allow some penetration and thus loss of energy to specular reflection from water. Reed can grow in a wide range of habitats from dry roadside ditches to several meters of open water but it is typically found on organic sediment accumulating on sheltered stretches of shore. Reeds standing in deep water have high conservation value, because they provide an essential habitat for spawning fish and nesting birds [75].
Healthy reed: A reed stand was regarded during ground truthing and validation as healthy if the stalks were high (above the approximate height of 1.5 m), had an even density with no open water between them, and if the majority of the stalks were vertical.
Die-back reed: Reed areas were categorized as being in a state of die-back if the density and height were very low or if clumps were present [7], separated by open water areas.
Stressed reed: Stressed reed was defined by alternating areas of low and high stalk density [15], with the canopies closing over any open water patches.
Ruderal reed: Ruderal reed areas are those where abundant nutrients and light allow the encroachment of terrestrial species, mainly weeds (e.g., Urtica dioica) or climbing plants (Humulus lupulus, Solanum dulcamara). These are typically found on the shore side of wetlands, around paths and artificial openings and in areas where waves deposit organic debris.

Airborne Laser Scanning Data
The ALS data were collected during the EUFAR AIMWETLAB survey in August 2010, by the NERC (Natural Environment Research Council) Airborne Research and Survey Facility. The detailed rationale and full technical background of the survey are explained in Zlinszky et al. [76]. The surveyed area was the shore zone of Lake Balaton and the area of the Kis-Balaton wetland, adding up to 1,000 km² of total measured area. Kis-Balaton and the larger shore wetlands on the lake were measured with a pattern of parallel strips, but to save flight time, most of the lake was covered by an irregular pattern of strips following the shoreline (Figure 1). A Leica ALS50 sensor operating at 1,064 nm wavelength with a sinusoidal scan pattern was employed. With this instrument a maximum of four echoes can be distinguished for each pulse. The instrument settings and mission parameters were chosen to provide a 1 pt/m² point density, 22 cm footprint diameter and ca. 1 km swath width from an elevation of 1,200 m above ground level. Horizontal and vertical point position accuracies were 0.15 and 0.1 meters respectively, according to sensor specifications. Echo amplitudes were modulated by an automatic gain control (AGC) and the AGC and amplitude values were included in the attributes of each point. The dataset was pre-processed by the NERC Data Analysis Node to the level of ASPRS .las files, and erroneous points resulting from atmospheric or multi-path echoes were identified.

Ground Truth Data
During the months before the flight, ground truth polygons were outlined in the field using a differential GNSS receiver (Leica GS 20, Leica Geosystems, Heerbrugg, Switzerland). These polygons were approximately 10 × 10 m areas where the abundance of the main macrophyte species was found to be homogeneous. Water depth, vegetation height, reed health and the abundance of the 17 most frequent species (including macrophytes, submerged and floating-leaved plants and trees) in the study area were recorded on a Braun-Blanquet scale (with 0 for absence and 5 for full monodominant cover) as attributes of these polygons. In order to have a number of reference areas clearly dominated by a single species, some nearly monodominant plots were cleaned of sub-dominant plants by hand clipping. Of the 82 plots surveyed altogether, 46 were monodominant and 36 were mixed, adding up to about 8,000 m² of reference data for about 100 km² of wetland vegetation within a full surveyed area of ca. 1,000 km². A set of 60 control points was also collected, where the dominant genus and its health were registered for quality control using the categories defined above (3.2). In order to facilitate radiometric calibration, the reflectance of an adjacent bright surface (white dolomite gravel parking space, reflectance at 1,064 nm: 53.5 ± 3.8%) and a dark surface (freshly deposited bare topsoil, reflectance at 1,064 nm: 13.8 ± 2.0%) were measured with a spectroradiometer (ASD Fieldspec 3, Analytical Spectral Devices, Boulder, CO, USA) simultaneously with the flight on one of the survey days. Since the flight also involved hyperspectral imaging [76], care was taken to collect data only under ideal atmospheric and illumination conditions. Because of this, it was assumed that the slight variability of atmospheric conditions during the flight period of ten days was negligible and that the calibration constants calculated on the basis of the reflectance measurements were valid for the whole survey.

Visualization and Quality Control
ALS point clouds were visually inspected in FugroViewer (Fugro Inc, Leidschendam, The Netherlands) in planar and profile views (Figure 2). After calculation of elevation rasters using moving planes interpolation in OPALS software [77], the remaining elevation differences in the overlapping areas created by the flight pattern were mapped. The errors in the calibration of the sensor system (misalignment) resulted in different elevations (in the range of up to 10-50 cm) of the same areas. These could not be resolved due to the relatively small overlapping area of strips; therefore it was decided to continue on the basis of individual strips and exclude absolute elevation of points from the classification scheme. Small variations in ground sampling density (<10%) were also present between strips.
However, it was also shown that different vegetation types have different point cloud profiles and reflectance characteristics. It was assumed that the echoes themselves did not contain enough information for point-based classification without full waveform recording such as that used by Wagner et al. [78,79]. Therefore, a raster approach was selected: a number of parameters were calculated in grid cells from the neighborhood of each point and these rasters were used as the input values for the classification algorithm.

Input Parameters and Calculations
The ALS data were processed using modules of the scientific laser scanning software OPALS [77]. Depending on the nature of each variable, raster sizes were selected to average across several ALS points or to map their parameters to a high resolution raster. The parameters used for classification were the following:
Surface reflectance (Figure 3(a-c)): As described in Section 2.5, the echo amplitude is influenced by the atmospheric attenuation, the range, the incidence angle and the area and reflectance of the footprint. In the case of the laser scanning system applied here, the recorded signal was additionally dependent on an automatic gain control which amplified the received signal strength in order to keep it within an 8 bit range. These effects were corrected by the OpalsRadioCal module [61]. This tool corrected each echo amplitude value for the above mentioned influencing factors by determining a sensor specific calibration constant derived from ground truth calibration targets with in situ measured reflectance (cf. Section 3.3.2).
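The idea of a calibration constant can be sketched in a strongly simplified form. The sketch below is not the OpalsRadioCal implementation: it assumes an extended Lambertian target, so that the recorded amplitude scales with reflectance × cos(incidence angle) / range², and it ignores the AGC and atmospheric corrections that the actual tool applies. The function names and the amplitude values are hypothetical; only the 53.5% reflectance of the gravel target is taken from the text.

```python
import math


def calibration_constant(ref_reflectance, ref_amplitude, ref_range_m, ref_incidence_rad=0.0):
    """Derive a calibration constant from a reference target of known reflectance.

    Simplified model (assumption): amplitude ~ reflectance * cos(incidence) / range**2.
    """
    return ref_reflectance * math.cos(ref_incidence_rad) / (ref_amplitude * ref_range_m ** 2)


def calibrated_reflectance(amplitude, range_m, incidence_rad, c_cal):
    """Convert an echo amplitude to reflectance using the same simplified model."""
    return c_cal * amplitude * range_m ** 2 / math.cos(incidence_rad)


# Bright gravel target: reflectance 0.535 (from the field measurement);
# the amplitude of 180 and the 1,200 m range are made-up illustration values.
c_cal = calibration_constant(0.535, ref_amplitude=180.0, ref_range_m=1200.0)

# An echo with half the amplitude at the same range and incidence angle
# maps to half the reflectance under this model.
rho = calibrated_reflectance(90.0, 1200.0, 0.0, c_cal)
print(round(rho, 4))
```

In this model a single bright target is enough to fix the constant; a second, dark target such as the bare topsoil surface can then serve as an independent check.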
Applying the calibration constant to the ALS amplitude data yields calibrated reflectance values for each echo as a dimensionless number between 0 and 1 [80]. These reflectance values were rasterized to a 1 m grid to conserve each reflectance observation from the 1 pt/m² data as far as possible.
Dropout point count (Figure 4(a,b)): While healthy reed stands have well-defined and usually straight boundaries between the vegetation and the water, stands affected by die-back diminish towards open water in a series of clumps, islands or "peninsulas", creating a complicated boundary shape. In order to find a simple method to quantitatively locate these areas, the shape of the reed edge and the position of open water leads and lagoons had to be assessed. From specularly reflecting surfaces, hardly any light reaches the sensor system, so open water is shown by missing points called dropouts [63]. This is especially true in reed stands or near the boundary where the vegetation creates a wind shadow and thus the water surface is very flat. Since the sensor has a continuous sinusoidal scan pattern, any dropout points caused by the presence of water are expected to be somewhere along the line joining the preceding and following points. For the purpose of this study, only the presence of missing points (and not their exact number or location) was used to outline water, creating one point marking the gap of any size in each scan line.
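The gap-marking idea can be sketched as follows, using the GPS time stamps and midpoint placement that the next paragraph describes. This is a Python illustration under stated assumptions, not the Matlab script used in the study: the function names, the time threshold and the toy coordinates are hypothetical, and the points are assumed to come from a single strip.

```python
import numpy as np


def dropout_midpoints(gps_time, x, y, max_dt):
    """Return one (x, y) midpoint for every gap in GPS time larger than max_dt."""
    gps_time = np.asarray(gps_time, dtype=float)
    order = np.argsort(gps_time, kind="stable")  # process echoes in recording order
    t = gps_time[order]
    xs = np.asarray(x, dtype=float)[order]
    ys = np.asarray(y, dtype=float)[order]
    gap = np.diff(t) > max_dt  # True where echoes are missing between neighbours
    mid_x = 0.5 * (xs[:-1][gap] + xs[1:][gap])
    mid_y = 0.5 * (ys[:-1][gap] + ys[1:][gap])
    return np.column_stack([mid_x, mid_y])


def count_per_cell(points_xy, x_min, y_min, n_cols, n_rows, cell=5.0):
    """Count points in each cell of a regular grid (row 0 starts at y_min)."""
    counts = np.zeros((n_rows, n_cols), dtype=int)
    col = np.floor((points_xy[:, 0] - x_min) / cell).astype(int)
    row = np.floor((points_xy[:, 1] - y_min) / cell).astype(int)
    inside = (col >= 0) & (col < n_cols) & (row >= 0) & (row < n_rows)
    np.add.at(counts, (row[inside], col[inside]), 1)  # handles repeated cells correctly
    return counts


# Toy scan line: regular pulses except for one gap between t = 2 and t = 5.
t = [0.0, 1.0, 2.0, 5.0, 6.0]
x = [0.0, 1.0, 2.0, 5.0, 6.0]
y = [0.0, 0.0, 0.0, 0.0, 0.0]
dropouts = dropout_midpoints(t, x, y, max_dt=1.5)
dropout_raster = count_per_cell(dropouts, x_min=0.0, y_min=0.0, n_cols=2, n_rows=2)
print(dropouts)
print(dropout_raster)
```

In practice the time threshold would be tied to the pulse repetition rate of the sensor, and the resulting count raster is what a threshold is later applied to.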
Since each point has a recorded GPS time, missing points could be detected by a Matlab (Mathworks, Natick, MA, USA) script wherever the GPS time difference between two echoes was above a threshold derived from the pulse repetition rate. To create a set of points representing the missing echoes of the water surface, the coordinates of the points preceding and following each dropout were averaged, so that the new point was created at the midpoint between them. If one edge of the scanned strip was above water, this created a row of estimated dropout points along the water boundary instead of on the area of the open water. For the size of gaps typical for die-back reed (1-5 m according to field experience), this simple algorithm created a row of interpolated dropout points along the center of the gap and parallel to the flight direction. These points were not written into the original point cloud, but a separate raster with 5 × 5 m cell size was created containing the number of such dropout points in each cell. A threshold of 3 dropout points per cell, selected on the basis of signature analysis (see Section 3.3.5), proved to recognize areas where the reed boundary was not straight or where gaps and islands were present.
NDSM (Normalized Digital Surface Model) height: the canopy height of some mapped vegetation categories is characteristic. A surface model representing the top of the canopy was created by selecting the highest ALS point in each cell of a 2.5 × 2.5 m grid (Figure 5) and rasterizing. The raster cell size selected was a trade-off between averaging over several scanned points and retaining high resolution for mapping. A basic terrain model was calculated by rasterizing the elevation of the lowest point in each cell of a 10 × 10 m grid. Visual investigation has shown that in most cases, such a large cell size is sufficient to have at least one ground/water echo inside. The elevation difference of these two rasters was calculated to a 2.5 × 2.5 m raster to create a normalized digital surface model of the canopy height in m, bearing in mind the fact that moderate-density ALS-derived NDSM height typically underestimates the canopy height. Thus, absolute vegetation heights were not calculated, but this layer was found to sufficiently represent the existing canopy height characteristic for accurate classification.
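The dropout point detection described earlier in this section (a GPS time gap between consecutive echoes, a midpoint between the two neighbouring points, and a count per 5 × 5 m cell) can be sketched as follows. This is only an illustrative Python sketch, not the Matlab script used in the study: the function names, the toy echoes, the pulse interval and the factor of 1.5 used to derive the time threshold are assumptions made for the example.

```python
from collections import Counter


def dropout_midpoints(echoes, max_dt):
    """Return one midpoint per time gap longer than max_dt.

    echoes: list of (gps_time, x, y) tuples sorted by gps_time.
    """
    midpoints = []
    for (t0, x0, y0), (t1, x1, y1) in zip(echoes, echoes[1:]):
        if t1 - t0 > max_dt:
            # One point marks the gap, whatever its size.
            midpoints.append(((x0 + x1) / 2.0, (y0 + y1) / 2.0))
    return midpoints


def count_per_cell(points, cell=5.0):
    """Count points per square raster cell, keyed by (column, row) index."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell), int(y // cell))] += 1
    return counts


# Toy scan line: the pulse interval and the 1.5 factor are hypothetical.
pulse_interval = 1.0e-5
max_dt = 1.5 * pulse_interval
echoes = [
    (0.0, 0.0, 0.0),
    (1.0e-5, 1.0, 0.0),
    (6.0e-5, 6.0, 0.0),  # four pulses missing before this echo
    (7.0e-5, 7.0, 0.0),
]
mids = dropout_midpoints(echoes, max_dt)
counts = count_per_cell(mids)

# Cells reaching the dropout count threshold of 3 (assumed here to mean "at least 3").
flagged = {cell for cell, n in counts.items() if n >= 3}
```

In this toy strip a single gap yields a single midpoint, so no cell reaches the threshold; on real data several neighbouring scan lines crossing the same gap would accumulate in one cell.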
Roughness: elevation inhomogeneities between ALS points and within areas of ALS-derived elevation rasters are a straightforward way to characterize vegetation vertical structure, which, in turn, is characteristic for genera and growth forms. For the purpose of this study, roughness was assessed at three different scales, providing three parameters that were considered independent (Figure 5).
Grid variance: Healthy wetlands were observed to have a homogeneous stalk density and canopy height, while density inhomogeneities related to vegetation stress cause variations in the penetration of the signal and thus the vertical distribution of echoes. To represent this, a 1 × 1 m raster model was created using moving planes interpolation [81] and including all points not categorized as erroneous during pre-processing (Section 3.3.1). The variance of this surface (in m) within a 3 × 3 cell kernel was calculated to a 1 × 1 m grid and used as an input raster of the classification.
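As an illustration of the grid variance parameter, the following Python sketch computes the variance of a small elevation raster within a 3 × 3 cell moving kernel. It is a minimal sketch under stated assumptions: the function name `kernel_variance` is hypothetical, the population variance is used, and the kernel is simply truncated at the raster edge, none of which is specified in the text.

```python
def kernel_variance(grid, half=1):
    """Variance of grid values inside a (2*half+1) x (2*half+1) moving kernel.

    grid: list of rows of elevations (m). The kernel is truncated at the
    raster edges (an assumption made for this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = sum(window) / len(window)
            out[r][c] = sum((v - mean) ** 2 for v in window) / len(window)
    return out


# A homogeneous canopy gives zero variance everywhere.
flat = kernel_variance([[2.0] * 3 for _ in range(3)])

# A single 3 m high outlier raises the variance of all kernels that contain it.
bumpy = kernel_variance([[0.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 0.0]])
```

For the central cell of the second raster the kernel holds eight zeros and one value of 3 m, so the variance is 8/9.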
Sigma Z: The small-scale surface roughness of wetland vegetation was found to be characteristic: the range of the vertical distribution of points is narrower than in trees and shrubs but wider than over mown lawns or artificial areas. During moving planes interpolation, an inclined plane was fitted to the 8 nearest points within a 2.5 m distance from the central point. The standard deviation of the vertical distances of each point to the fitted plane is called Sigma Z and describes the roughness of the surface sampled by the ALS points (Figure 5).
Sigma Z is assigned to each ALS point, and to preserve boundaries and small-scale patterns (but not overlooking the fact that this was calculated over a 2.5 m radius for each point), this parameter was also rasterized to a 1-m grid.
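The Sigma Z calculation can be illustrated with a least-squares plane fit. The Python sketch below is not the OPALS implementation: the function names are hypothetical, the plane is fitted as z = a·x + b·y + c by solving the normal equations, the central point is assumed to be part of its own neighbourhood, and the standard deviation is taken as the root mean square of the vertical residuals.

```python
import math


def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through (x, y, z) points."""
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]

    def det3(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points are (nearly) collinear; plane is undefined")
    coeffs = []
    for k in range(3):  # Cramer's rule: replace column k by the right-hand side
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = rhs[i]
        coeffs.append(det3(mk) / d)
    return tuple(coeffs)


def sigma_z(center_xy, cloud, k=8, radius=2.5):
    """RMS vertical residual of the k nearest points (within radius) to their plane."""
    cx, cy = center_xy

    def dist(p):
        return math.hypot(p[0] - cx, p[1] - cy)

    near = sorted((p for p in cloud if dist(p) <= radius), key=dist)[:k]
    a, b, c = fit_plane(near)
    residuals = [z - (a * x + b * y + c) for x, y, z in near]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Toy point clouds on a 1 m lattice: an inclined plane and a rough surface.
planar = [(x, y, 0.1 * x + 0.2 * y + 1.0) for x in (-1, 0, 1) for y in (-1, 0, 1)]
rough = [(x, y, 0.5 * ((x + y) % 2)) for x in (-1, 0, 1) for y in (-1, 0, 1)]
sigma_planar = sigma_z((0.0, 0.0), planar)
sigma_rough = sigma_z((0.0, 0.0), rough)
```

Points lying exactly on an inclined plane give a Sigma Z of essentially zero regardless of slope, while the alternating 0 m and 0.5 m surface gives a clearly non-zero value, which is the property the classification relies on.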

Signature Analysis
The ground truth polygons were rasterized with the ArcGIS 9.3 (ESRI, Redlands, CA, USA) Rasterize tool to 1-m grid size and grouped according to the variables measured in the field. These included the monodominance and dominance of each studied macrophyte genus, different ranges of vegetation height and different categories of reed health. For five of the ALS-derived parameters used (Sigma Z, NDSM height, surface variance, reflectance and dropout point count), histograms were generated for each group. These histograms were compared visually (Figure 6), so that similar and dissimilar groups were identified manually for each measured parameter, and threshold values for separating classes were selected. Signature analysis was not performed for DTM variance as it was assumed that the formation of wetlands is incompatible with sloping terrain over large areas.

Classification Algorithms Applied for Wetland Masking and Classification (Table 1)
Based on the results of the signature analysis, a manually derived rule-based decision tree algorithm was created in the batch script controlling the OPALS software. The basis of the procedure was a raster layer, and in each step of the decision algorithm, the criteria of one of the classes were tested on the given pixel. If they were fulfilled, the pixel was assigned to that particular category, and this was not changed in the subsequent steps. If not, the algorithm moved on to testing criteria of the next category (Table 1). The order of classification steps was selected to begin with the better-separable classes and end with those represented by less clearly constrained ranges of the input parameters. In order to exploit the high spatial accuracy of the data and the strong separability of non-wetland classes, but produce acceptable accuracy in the more subtle wetland sub-categories, two output files were generated. The one for discrimination between wetland vegetation and non-wetland areas had a cell size of 1 × 1 m, and the other for wetland vegetation categorization had a cell size of 2.5 × 2.5 m. The 1 × 1 m grid size was selected in order to map wetland boundaries with a resolution suitable for future change detection studies.
The most strictly defined category was Scirpus, defined by minimum and maximum values of Sigma Z, reflectance and NDSM height. These thresholds were connected by an "AND" operator, so the pixel was only assigned to this category if all the criteria were simultaneously fulfilled. The next category in the ruleset was trees, selected on the basis of a minimum NDSM height and a minimum Sigma Z, with an "OR" logical operator, so all pixels were classified as trees where one of the criteria was met.
Water/artificial areas were classified using a Sigma Z minimum, an NDSM maximum, a DTM variance maximum or an upper and lower reflectance threshold. If any of the selection criteria were fulfilled, the pixel was characterized as water/artificial.
Finally, pixels classified as wetlands were those where the Sigma Z values were within the thresholds selected for flat areas and trees, and the NDSM height and DTM variance were lower than the values used for tree or artificial area selection.
After setting the output extents to the data extents and the cell size to 2.5 × 2.5 m in the algorithm, the first class selected was Carex, which was defined by an upper NDSM height threshold, a minimum and maximum reflectance and a maximum grid variance. The next category in the sequence was die-back reed, constrained by the reflectance range of reed and a minimum number of dropout points representing gaps in the vegetation. Next, Typha was classified on the basis of a maximum reflectance and a maximum grid variance, followed by ruderal reed, which was outlined by a minimum reflectance and a maximum grid variance. Stressed reed was mapped on the basis of a minimum grid variance (the same value used as a maximum for the previous categories).
Finally, the previously unclassified pixels were all categorized as healthy reed (if they were outside the grid variance and reflectance margins of stressed and ruderal reed), ensuring that no unclassified pixels were left.
The order of classification steps was refined by a manual iterative quality control process, but the threshold values were strictly kept as the signature analysis recommended.
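The wetland sub-classification sequence described above can be illustrated as a first-match rule cascade. The Python sketch below is only an illustration of that logic: apart from the dropout count of 3 mentioned in the text, all threshold values, the dictionary keys and the function name `classify_wetland_pixel` are hypothetical placeholders and not the values of Table 1.

```python
# Hypothetical thresholds for illustration only (not the values of Table 1).
T = {
    "carex_ndsm_max": 1.0,     # m
    "carex_refl_min": 0.30,
    "carex_refl_max": 0.60,
    "grid_var_limit": 0.05,    # maximum for several classes, minimum for stressed reed
    "reed_refl_min": 0.15,
    "reed_refl_max": 0.40,
    "dropout_min": 3,          # dropout points per cell (from the text)
    "typha_refl_max": 0.15,
    "ruderal_refl_min": 0.40,
}

# Rules are tested in order; the first rule that is fulfilled assigns the class.
RULES = [
    ("carex", lambda p: p["ndsm"] <= T["carex_ndsm_max"]
        and T["carex_refl_min"] <= p["refl"] <= T["carex_refl_max"]
        and p["grid_var"] <= T["grid_var_limit"]),
    ("dieback_reed", lambda p: T["reed_refl_min"] <= p["refl"] <= T["reed_refl_max"]
        and p["dropouts"] >= T["dropout_min"]),
    ("typha", lambda p: p["refl"] <= T["typha_refl_max"]
        and p["grid_var"] <= T["grid_var_limit"]),
    ("ruderal_reed", lambda p: p["refl"] >= T["ruderal_refl_min"]
        and p["grid_var"] <= T["grid_var_limit"]),
    ("stressed_reed", lambda p: p["grid_var"] >= T["grid_var_limit"]),
]


def classify_wetland_pixel(pixel):
    """Return the first class whose criteria are met; the remainder is healthy reed."""
    for label, rule in RULES:
        if rule(pixel):
            return label
    return "healthy_reed"


examples = [
    {"ndsm": 0.5, "refl": 0.45, "grid_var": 0.01, "dropouts": 0},
    {"ndsm": 2.5, "refl": 0.25, "grid_var": 0.02, "dropouts": 4},
    {"ndsm": 2.0, "refl": 0.10, "grid_var": 0.02, "dropouts": 0},
    {"ndsm": 2.5, "refl": 0.50, "grid_var": 0.02, "dropouts": 0},
    {"ndsm": 2.5, "refl": 0.25, "grid_var": 0.20, "dropouts": 0},
    {"ndsm": 2.5, "refl": 0.25, "grid_var": 0.02, "dropouts": 0},
]
labels = [classify_wetland_pixel(p) for p in examples]
```

Because an assigned class is never changed by later steps, the order of the rules matters as much as the thresholds, which is why the order was refined iteratively while the thresholds were kept fixed.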

Validation
To check against an independent standard in addition to our own ground truth points, we used a set of georeferenced ground photographs collected during the summer of 2010 by the National Water Authority for a different purpose (field-based wetland vegetation categorization). In the first step of quality control, 17 test strips were selected that overlapped with the original ground control areas and were relatively well distributed over the wetland areas. In order to have a similar number of ground truths used for each category but keep the spatial coverage of the reference dataset, the number of images belonging to each class was assessed and a final set of 775 reference images selected. This was done by keeping every 7th image for die-back reed, every 3rd for healthy reed, every 2nd for stressed reed and water/artificial, and by keeping all of the reference images found for the rest of the categories.
This resulted in approximately 100 reference images for major categories and about 50 for the less frequent classes. Ground reference photographs collected within the area of these 17 strips were assigned one or several of the dominant vegetation categories (see Section 3.2) according to visual interpretation of the images. In the rare case that the vegetation was not clearly recognizable on the photo, that particular image was discarded. The ground points where the photos were taken were overlaid on a GIS, the images visualized and their alignment and range estimated from visible shore objects. After this, the area estimated to be covered by the image was inspected on the ALS-based classified vegetation map (Figure 7). For close range photos, this could mean a patch of a few square meters; for longer range images, this would mean 10-20 m of shore. The vegetation categories assigned to the photographs were compared with the vegetation categories shown by the classified map for the same spot, and thus a confusion matrix [82] was created (Table 2).
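The accuracy measures reported in the following sections are derived from such a confusion matrix. As a minimal sketch, the Python function below computes user's accuracy, producer's accuracy, overall accuracy and Cohen's Kappa; the function name and the two-class toy matrix are assumptions made for illustration and do not reproduce Table 2. Rows are taken to be the classes shown on the map and columns the classes of the reference photographs.

```python
def accuracy_metrics(matrix):
    """Accuracy measures from a square confusion matrix.

    matrix[i][j]: number of reference points mapped as class i (row)
    whose reference class is j (column).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diag = [matrix[i][i] for i in range(n)]

    # User's accuracy: share of points mapped as a class that truly belong to it.
    users = [d / r if r else float("nan") for d, r in zip(diag, row_sums)]
    # Producer's accuracy: share of reference points of a class mapped correctly.
    producers = [d / c if c else float("nan") for d, c in zip(diag, col_sums)]

    overall = sum(diag) / total
    # Agreement expected by chance from the row and column totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return {"users": users, "producers": producers,
            "overall": overall, "kappa": kappa}


# Hypothetical two-class example (not the data of Table 2).
toy = [[45, 5],
       [10, 40]]
result = accuracy_metrics(toy)
```

When the user's accuracy of a class is lower than its producer's accuracy, more points were mapped into the class than the reference contains, so its area is overestimated; the opposite relation indicates underestimation.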

Visual Quality Control Results
Quality control of the point cloud revealed that in most cases, echoes from water were lost due to specular reflection and that the canopies were not fully penetrated by the pulses. Multiple echoes were only recorded in trees, as the wetland vegetation canopy height rarely exceeded the minimum vertical distance needed for separation of subsequent echoes of a single pulse. Visual quality control of the resulting maps (Figure 7) revealed that the wetland recognition does not miss any known wetland areas but also classifies agricultural fields as wetlands in some cases. Since the main objective of the process was not to locate wetlands in the landscape, these were not removed manually.
Flat areas and water were well mapped, including small lagoons within wetland areas, but artificial elevated structures such as piers, platforms and moored boats under a certain size were classified as wetland. Trees and shrubs were accurately categorized, but in some rare cases, waves were misclassified as Scirpus or Carex, especially in the centre of the strips where water reflections were the most intensive (Figure 7). The patterns of the wetland vegetation classification showed striking similarity to the patterns of vegetation patches observed in the field or aerial photographs. The overlapping areas of strips showed high similarity except for some edge effects.
For the class "wetlands" (Table 3), the user's accuracy was 97.1% and the producer's accuracy 100%, indicating that all ground truth points that belonged to wetlands were successfully categorized and only very few non-wetland points were categorized as wetland, mainly open water patches. Therefore, the area of wetlands was only very slightly overestimated.
Trees and shrubs were recognized with a user's accuracy of 100% and a producer's accuracy of 98.0% (Table 2). Classifying low trees or shrubs as reed was the only source of omission errors and no commission errors were present. No other vegetation was classified as tree and therefore, the area of trees was minimally underestimated.
The Scirpus class had a user's accuracy of 97.3% and a producer's accuracy of 92.3%, but since this is a relatively rare vegetation type in the study area, only 39 ground truth points could be used (Table 2).
Omission errors were caused by classification as die-back reed, healthy reed or water, while the single commission error was a misclassification of open water (waves). The area of Scirpus was slightly underestimated.
Water/artificial had a user's accuracy of 99.0% and a producer's accuracy of 88.9% (Table 2). The only commission error was a Scirpus area recognized as water, but several omission errors were found, mainly because this category also contained artificial structures such as boats and platforms. In the cases when these were not correctly recognized, they were mainly misclassified as die-back reed because they were close to dropout points (open water). Occasionally this class was also falsely determined as Carex due to the relatively low height. Since omission errors were more frequent than commission errors, the area occupied by water/artificial surfaces was somewhat underestimated.
User's accuracy for Typha was 72.9%, while producer's accuracy was 88.6% (Table 2). The number of ground truth points found was slightly lower than the 100 used for most categories. Dark colored reed patches were a source for commission errors, which were also caused by young (bright and low) and dense Typha growth categorized as Carex. Omission errors were nearly all caused by Typha patches and gaps of open water being interpreted by the algorithm as die-back reed. Typha was overestimated, as shown by the user's accuracy being lower than the producer's accuracy.
The performance of the Carex class could only be tested in 48 control points. User's accuracy was 82.9%, while producer's accuracy was only 60.4%, the lowest of all classes (Table 2). While commission errors were relatively rare, they strongly affected the calculated accuracy because of the low number of points, with the most important problem being the categorization of artificial structures as Carex. Omission errors were more frequent, as some of the test points were from exceptionally high Carex stands (as revealed by field notes) and classified as reed, while others probably had a low density and thus lower reflectance and were classified as Typha. This also meant that the presence of Carex in the sites used for ground truthing was strongly underestimated by this classification scheme. If the two non-reed wetland vegetation classes were merged, misclassifications between them were not counted as errors (Table 3), and this single class would have a user's accuracy of 80.1% and a producer's accuracy of 84.6%.
The category reed (including all four reed health categories) had a user's accuracy of 91.6% and a producer's accuracy of 94.0% (Table 3). The main sources of commission errors were patchy Typha areas classified as die-back reed, and very tall Carex stands categorized as healthy or ruderal reed. The main source of omission errors was dark colored (low-density) reed being categorized as Typha. However, since the producer's and user's accuracies were relatively close, the overall area occupied by reed was only marginally overestimated.
Healthy reed had a user's accuracy of 80.2% and a producer's accuracy of 80.7% (Table 2). Both commission and omission errors were mainly caused by misidentification of die-back reed, and sometimes created by misclassification of stressed reed. The small difference in accuracies suggested that the overall area occupied by healthy reed was correctly measured.
Die-back reed was classified with a user's accuracy of 62.5% and a producer's accuracy of 76.5% (Table 2). Commission errors were abundant as water reflections caused by waves could influence the relatively simple dropout counting algorithm. In these cases, the boundary of wetland vegetation was categorized as die-back reed. Artificial gaps in vegetation or artificial structures creating a patchy pattern were often misclassified as die-back reed. Omission errors were mostly die-back reed being categorized as healthy reed. The current algorithm strongly overestimated the presence of die-back reed.
Stressed reed had a user's accuracy of 80.4% with a producer's accuracy of 72.9% (Table 2). The main source of errors was the uncertain separation of die-back and stressed reed, while some omission errors were a result of misclassification as Typha. High grid variance caused by the vicinity of trees also introduced some artifacts: in these cases healthy reed was misclassified as stressed. The lower producer's accuracy compared to user's accuracy shows that the area of stressed reed was underestimated. If die-back and stressed reed were merged to a single category, the corresponding user's and producer's accuracy was 80.7% and 85.4% respectively. This implied that the recognition of reed stress itself was relatively reliable, while quantification of the level of damage was uncertain.
Ruderal reed was recognized with an 84.6% user's and 78.6% producer's accuracy, although the number of ground truth points was relatively low (Table 2). Commission errors were mainly caused by false identification as Carex, while omission errors usually involved misclassification as healthy reed. This category was slightly underestimated.
The total accuracy of the survey over the full area of the tested strips was 82.7%, with a Cohen's Kappa of 0.80. This indicated that although errors and artifacts were still present, the general reliability of the classification process was good, with a strong correspondence between ground truths and the classified map.

Discussion of the Survey Flight
The planning and setup of the airborne survey of Lake Balaton had to deal with a number of trade-offs due to limited flight time and ground resources. The main problem was the decision between full coverage of the study area with a limited ALS ground sampling density or, alternatively, only partial coverage with high ALS point density. In order to achieve full coverage of the lake with at least the typical point density of regional-level surveys (1 pt/m²), the flight pattern was set to have only single-line coverage of most of the shore areas (Figure 1). Due to this flight pattern, instead of the typical 20-50% overlap between neighboring (parallel) ALS strips, in most cases the surveyed ALS strips overlapped with their neighbors at the ends only (2-10% overlap, see Figure 1). This compromised the absolute positioning accuracy of the data in the vertical sense, creating errors in the range of 10-50 cm between strips. Although the classical parallel block pattern can seem redundant for lake shore surveys as the majority of the surveyed area is open water, it has the advantage that overlaps between strips can facilitate accurate relative and absolute georeferencing [83]. The sinusoid scan pattern of the Leica sensor resulted in slightly uneven point densities (<1%) across track and this sometimes created artifacts in the surface variance and Sigma Z layers near the strip edges.
The ALS sensor used works with an Automatic Gain Control that modulates the recorded echo intensity to keep it within the range of data representation (8 bits). Because the exact gain function was unknown, Lehner et al. [80] tested a linear and an exponential gain function based on homogeneous reflecting areas. The exponential function was found to characterize the gain function better and was therefore utilized within the calibration procedure. However, in some regions the calibration method could not reproduce the real surface reflectances due to the time necessary for the gradual adjustment of the gain control voltage.
As the aircraft moved along the shore, most straight strips started or ended with the full width of the strip occupied by open water. In the cases where this was the start, the gain control was initially adjusted to the low reflectance of water, and as the plane moved in over the coast, the gain control took about one second to reach the values necessary for recording the higher intensity dry land echoes. The first scan lines covering the shore were thus collected using gain control values which were too high; therefore the echo amplitude oversaturated the receiver. The radiometrically calibrated reflectances of these echoes were erroneously low (since they were normalized by the high gain control) and did not represent the real reflection properties of the surfaces. In these areas, the vegetation was typically (incorrectly) classified as non-wetland for 100-200 m along the shore (reflectances lower than the minimum of the "wetland" category), and classified as Typha for approximately the next few hundred m, as this was the darkest vegetation category. Since the algorithm was applied to each ALS strip separately, and the ends of the flown strips always overlapped along the shore (Figure 1), this did not have a strong influence on the resulting vegetation map.
Surface reflectance reference values have not been measured with an active reflectometer [61,84] but with a passive spectroradiometer. Despite these difficulties, the reflectance range of the reference areas was sufficient for empirical estimation of the gain control function, which could then be included in the radiometric calibration scheme [80].

Parameter Calculation and Algorithm
The radiometric correction algorithm proved to work adequately for the echo amplitudes and gain control values usual for wetland vegetation. The resulting reflectance values were successfully used for accurate separation of artificial and vegetated surfaces as well as wetland categories. The water surface recognition algorithm was simple and robust but also introduced errors in areas where waves created ALS echoes from open water. As a result of this, the number of dropout points did not only depend on the vegetation pattern but also on the presence of waves, so if a single threshold value was applied to all flown ALS strips, die-back reed was overestimated on datasets with waves and underestimated on datasets without waves.
One of the advantages of the classification algorithm presented here is that the decision steps leading to the categorization of pixels are relatively simple to build and understand based on the signature analysis, and can all be directly explained on the basis of the vegetation structure observed in the field. Artifacts often follow the scan pattern and are thus easy to find during image interpretation, while advanced multivariate image classification methods (especially neural networks) can sometimes be a "black box" where errors are difficult to resolve [56]. Contrary to the automatic selection of discriminatory parameters and values of multivariate methods, the signature analysis performed here is clear, relatively robust and easy to change. Compared to remote sensing methods that rely mainly on spectral parameters, this ALS-based algorithm relies on geophysical quantities such as NDSM height, elevation variance and surface reflectance (including dropouts) and is thus theoretically independent from many of the mission parameters (flying height, swath width, etc.). The thresholds and parameters were set based on a detailed understanding of the structure of the different vegetation classes and the representation of the structure in the point cloud. Some threshold values could need adjusting if a completely different point density or footprint size is used, or if the scale of vegetation units is different. It can therefore be expected that the proposed method including ground truthing, signature analysis and decision tree algorithms can be relatively easily adapted to other campaigns.

Selection of Categories
The categories were selected before the airborne survey during ground truthing in order to provide ecologically relevant information at the highest detail possible without the need for species-level identification in the field. The selected categories provide a strong representation of the health status of shore wetland vegetation. In case of reed, the most dominant species, which is also vulnerable to rapid die-back, this allows the quantitative mapping of vegetation health in an ecophysiological sense, with four categories describing the status. In case of the other categories and the habitat in general, the zonation and patch structure of the wetlands can be visualized and used to represent ecosystem health.
The question whether the information in the ALS point dataset can be mapped to a sufficient number of independent variables for classifying to a relatively high number of vegetation genus and health categories is highly relevant to this study. Identifying a high number of categories numerically representing vegetation structural parameters (such as height, density or biomass) would probably have been feasible but was not in the focus of this study and was also avoided intentionally due to the limitations of the dataset (e.g., low canopy penetration). The moderate density, discrete echo dataset and the simple decision tree algorithm performed adequately for most categories but, as expected, reducing the number of categories by merging increases overall accuracy values. If Carex and Typha are merged to "non-reed wetland" and stressed reed and die-back reed are merged to "unhealthy reed", the overall classification accuracy is 86.6% with a Cohen's Kappa of 0.85, and these categories are still relevant for conservation ecology. These accuracy levels are probably more suitable for detailed quantitative investigations, but the information which would be lost through merging is valuable for the purpose of change detection, monitoring and ecological mapping.
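The effect of merging categories on overall accuracy can be illustrated by summing the corresponding rows and columns of a confusion matrix, so that confusions between the merged classes move onto the diagonal. The Python sketch below uses hypothetical function names and a small invented three-class matrix; it does not reproduce the values of Tables 2 and 3.

```python
def merge_classes(matrix, labels, groups):
    """Merge classes of a confusion matrix.

    groups: dict mapping a new label to the list of old labels it replaces.
    Classes not listed in any group keep their own label.
    """
    mapping = {old: new for new, olds in groups.items() for old in olds}
    new_labels = []
    for lab in labels:
        target = mapping.get(lab, lab)
        if target not in new_labels:
            new_labels.append(target)
    index = {lab: i for i, lab in enumerate(new_labels)}
    merged = [[0] * len(new_labels) for _ in new_labels]
    for i, row_label in enumerate(labels):
        for j, col_label in enumerate(labels):
            r = index[mapping.get(row_label, row_label)]
            c = index[mapping.get(col_label, col_label)]
            merged[r][c] += matrix[i][j]
    return merged, new_labels


def overall_accuracy(matrix):
    """Share of reference points on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


# Hypothetical counts for illustration only.
labels = ["reed", "carex", "typha"]
matrix = [[8, 1, 1],
          [1, 6, 3],
          [1, 2, 7]]
merged, merged_labels = merge_classes(
    matrix, labels, {"non_reed_wetland": ["carex", "typha"]})
before = overall_accuracy(matrix)
after = overall_accuracy(merged)
```

In this toy example the five Carex/Typha confusions become correct assignments after merging, so the overall accuracy rises from 21/30 to 26/30; merging can never lower the overall accuracy, only the level of detail.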

Accuracy of Classification Categories
The classification of non-wetland categories and wetlands themselves was more accurate than expected, with values mostly well above 90% (Table 2). In fact, the overall accuracy produced and the reliability of the different classification categories can be compared to the repeatability of some completely field-based vegetation mapping surveys [85,86], which is also regularly between 80-90%. Although the algorithm still identifies some agricultural areas as wetlands (including certain cereal fields and vineyards), it can be used on its own to locate wetland patches in the rather complex landscape of the study area if some data on the location of agricultural areas are accessible. This can help identification of small wetland pockets near the main lake that can have high conservation value as ecological corridors or sanctuaries [87,88]. Vineyards, harbor piers, boats and platforms classified as wetland vegetation are a limitation of the point density. Several algorithms have been tested to remove row crops or boats with masts, but these were not found reliable enough and would have introduced their own artifacts. The problem of the waves on the lake surface being classified as wetland vegetation could have been eliminated by setting a minimum elevation threshold that corresponds to the water level of the lake. However, this was not attempted because of the problems of absolute georeferencing in the vertical sense (see Section 3.3.3), and also because some wetlands in the surrounding landscape have lower elevation than the water level of the lake.
Typha and Carex are recognized by this basic decision tree algorithm with moderate numerical reliability in case of the ground control dataset used (60-88%). The pattern of Typha and Carex patches identified still suggests that major typical stands are correctly recognized, but exceptions to the simple rules applied here cause errors. The main problem source, artificial structures misclassified as Carex, can compromise direct quantitative use, but these can usually easily be recognized when viewed as a map and removed or labeled manually if necessary. In some cases, the misclassification of tall Carex stands as reed, or low density reed stands as Typha, has some ecological relevance, since similarity between the vertical structure of these confused vegetation types also means that they form similar habitats.
The presence of reed, the macrophyte with the highest conservation value, is recognized by the algorithm with considerable accuracy (>90%) (Table 2). The identification of reed is balanced in terms of under- or over-estimation and reed is rarely mistaken for other macrophytes. This implies that ALS time series can be used for automatic detection of subtle changes in the area occupied by reed, such as those caused by water level fluctuations [24].
The identification of healthy reed areas is also balanced and relatively accurate. Classifying healthy reed areas in the last step of the decision tree algorithm ("all areas previously unclassified", Section 3.3.6) was expected to be a source of error, since this means many exceptions to rules are classified as healthy reed. Since healthy reed is the most abundant vegetation category on the lake, it was assumed that this class would be the least compromised by such criteria, and results show that the accuracy of this class is satisfactory.
Since the process of reed die-back is well known to affect the stand structure of reed while the effect of reed stress on ecophysiological properties (and thus radiometric properties) at the stand level is not fully understood [7], ALS holds a strong potential for monitoring reed health. Although numerical quality control showed that reed die-back is overestimated by the set of parameters used, these artifacts are often easy to locate during interpretation as only a narrow strip on the immediate border to water is labeled as die-back. Artificial objects and structures classified as die-back reed, such as boats or fishing platforms, can be more difficult to recognize, but since their area is relatively small, they can be assumed to have a negligible contribution to the overall area occupied by die-back reed.
The identification of stressed reed (as defined in Section 3.2) was more accurate than die-back reed, as the parameters used here were less influenced by artifacts.
Finally, ruderal reed could also be identified with adequate accuracy even though it is a rare category and separation from other reed types is often not evident in the field. Misidentification as Carex is explained by the field observations that vertical growth of reed stalks is often impaired by competition of other species present (creating low vegetation heights similar to Carex), and Carex also tends to form mixed stands with ruderal reed.
Selecting larger cell sizes might also have enhanced the classification quality, but since change detection is one of the planned applications, spatial resolution had to be kept within the expected range of short-term changes of vegetation boundaries. The hybrid approach of using 2.5 m cells for wetland classification and masking this with a 1-m raster of non-wetland categories allows accurate detection of changes in wetland area without creating redundantly high resolution data from low resolution input rasters.

Discussion of Quality Control Method
The usual method for quality control is cross validation: separating the ground truth polygon dataset, reserving some pixels for automatic validation and using the rest for calibration [30,56,89]. In our case, a different approach was chosen because the relatively low spatial resolution (2.5 × 2.5 m) of the output wetland classification raster restricted the number of pixels available from the 10 × 10 m area of the ground truth polygons. Since the number of pixels used for algorithm calibration is a crucial factor of the classification accuracy achievable [90], all of these were used for signature analysis in this study. Although field-based information can be slightly less objective, the ground image-based validation provided widespread spatial coverage and a full range of vegetation types. However, since quality control was not automatically performed, errors of the operator cannot be completely excluded.

Multispectral and Hyperspectral Surveys
The classification accuracy reached in this study is clearly lower than the maximum possible accuracy achievable by multivariate processing of hyperspectral data, but many wetland hyperspectral surveys have produced similar or lower accuracy and Cohen's Kappa values [13,29,89]. An overall accuracy of 99.2% was reached during classification of ROSIS imaging spectrometer data to five vegetation categories in a saltmarsh [26]. Other studies of hyperspectral classification of wetland vegetation show accuracies from 90% (Kappa 0.87) for six classes [30] and 91% (Kappa 0.87) for five classes [89], to 78% (Kappa 0.63) for six main dominant littoral vegetation genera [29] and 78.3% (Kappa 0.72) for five wetland vegetation growth-form classes mapped by 15 selected hyperspectral bands [13].

Combined ALS-Multispectral Surveys
The accuracy of the current survey is also within the range of some multispectral-ALS fusion studies: the dendrogram-based classification of fused Compact Airborne Spectrographic Imager and ALS data yielded an overall accuracy of 74% (Kappa 0.66) for six classes [91]. Object-oriented classification of ALS data fused with QuickBird satellite images identified six main land cover categories in a riparian savannah setting with an overall accuracy of 85.6% [59] (Kappa not published). By fusing a simple ALS-derived canopy height mask with selected bands of a hyperspectral dataset, detection of reed stands could reach an accuracy of 94% for two classes (Kappa not published) [57]. In case of forests, the fusion of ALS with aerial photographs or multispectral images also produced accuracies above 95% [92]. Identifying 3 natural and 5 artificial classes in an insect habitat mapping context was possible with an overall accuracy of 89.2% (Kappa 0.88) by fusion of multispectral images with a LIDAR-derived canopy height model [93].

ALS-Based Vegetation Surveys
An ALS-based study aiming at detecting a single invasive genus (Spartina sp.) found that the accuracy that could be reached by vegetation filtering of the point dataset was sufficient for identification of stand expansion and sediment accumulation [50]. In forests, the accuracy of vegetation categorization based on ALS can be similar to this study: separating coniferous and deciduous trees is possible on the basis of full-waveform data with 85% overall accuracy [94], while the identification of the three main deciduous tree genera had an accuracy of 64% [65] (Kappa values not published).
A comparable study of vegetation mapping in an intertidal environment based on dual-wavelength ALS [51] provided a remarkable overall accuracy of 91.89% (Kappa 0.91), based on vegetation cover estimation using the reflectance ratio of the two bands in addition to vegetation structure.
These studies indicate that hyperspectral surveys, combined hyperspectral-ALS classification or dual-wavelength ALS can sometimes provide better accuracies than single-band ALS, both in forests and wetlands. It should not be overlooked, however, that most comparable studies use fewer categories for classification. Single-band ALS-based classifications have similar reliability in forests and wetlands. This can be explained by the fact that homogeneous patches in wetlands can have horizontal extents similar to or larger than trees, so the units of classification in wetlands are often large relative to the ALS point spacing.
In addition to classification accuracy, many other factors have to be considered during survey planning: constraints on weather, flight/ground truthing time or funding can mean that in some cases, classification of single-band ALS data can be the most efficient method.

Applicability of the New Method for Regional and Local Scale Wetland Vegetation Mapping
The fact that ALS surveying is often carried out in preparation for major construction projects means that the method presented can potentially be applied in rapid impact assessment involving wetland areas, providing information not only on the area but also on the condition of the wetlands that are to be affected. The potential of ALS data for accurate topography mapping and change detection has also been recognized by several national and state governments in recent decades [95], leading to full surveys of regions or states [96,97]. These datasets are widely used for forest monitoring, natural hazard recognition, infrastructure planning, geomorphologic investigations and even change detection in the regions that have now been surveyed several times [98]. ALS data are thus already available for large areas including many of the largest European lakes (Austria, Switzerland, Southern Germany, Southern Sweden and Norway, Southern Finland, the Netherlands, major Hungarian lakes). Since these regions have been surveyed with ALS settings similar to those used for this study, it can be expected that wetland maps based on the regional datasets would also have similar accuracy. Even if the resulting maps did not necessarily reflect present conditions, they would support the creation of wide-scale wetland inventories such as [99] and hold valuable information for change monitoring, habitat assessment and theoretical ecology.
Remote sensing scientists are familiar with the trade-off between measuring locally with high accuracy (usually by field sampling) and surveying the whole study area with lower accuracy (by remote sensing). The suitability of readily available national laser scanning data puts this question into a new perspective: is it better to have potentially near-perfect vegetation classification from a dedicated survey with a specialized instrument, or is it better to use regional ALS data for full coverage of the study area even if the classification might be less reliable?
The EU Water Framework Directive requires EU member states to regularly monitor the health of all aquatic ecosystems, including shore wetlands. While hyperspectral imaging can certainly provide the necessary data, it is currently unrealistic for most member states to perform regular hyperspectral campaigns on all wetlands. Airborne multispectral imaging surveys fused with ALS have also been shown to provide the necessary data, but ALS as a standalone tool can produce similar or better accuracies with the proposed method.

Conclusions
Based on the expert-based decision tree classification of an airborne laser scanning survey of a major wetland system, it has been demonstrated that ALS is a suitable tool on its own for mapping reed wetland vegetation to the level of the dominant genus and reed health classes, with an overall accuracy of 82.71% and a Cohen's Kappa of 0.802. A ruleset-based decision tree algorithm was created that categorizes ALS-derived rasters into nine vegetation categories: three non-wetland classes, the three dominant emergent macrophyte genera, and four classes of reed health. The connection between the differences in vegetation structure observed in the field and the corresponding ALS-derived parameters can be established in a straightforward way. While ALS-based classification can be less accurate than the best hyperspectral, fused ALS-multispectral or multi-band ALS wetland surveys, enhancing the information content by dropout modeling and radiometric calibration has produced accuracies comparable to many such studies and similar to the repeatability that can be achieved by terrestrial vegetation mapping.
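To illustrate what a ruleset-based decision tree over ALS-derived raster cells looks like in practice, a deliberately simplified sketch follows. It is not the ruleset used in this study: the `Cell` fields, the class subset, every threshold and the direction of each split are hypothetical placeholders chosen only to show the structure of such a classifier.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical per-cell values; not the parameters listed in Table 1."""
    height_m: float          # vegetation height above terrain
    reflectance: float       # calibrated reflectance, 0..1
    dropout_fraction: float  # share of modeled dropout points, 0..1


def classify(cell: Cell) -> str:
    # All thresholds below are illustrative assumptions, not calibrated values.
    if cell.height_m > 5.0:
        return "tree"
    if cell.height_m < 0.2 and cell.dropout_fraction > 0.8:
        return "open water"
    if cell.dropout_fraction > 0.3:
        return "reed die-back"
    if cell.height_m > 2.5:
        return "reed"
    # The direction of this reflectance split is arbitrary in this sketch.
    return "Typha" if cell.reflectance > 0.4 else "Carex"


print(classify(Cell(height_m=3.0, reflectance=0.3, dropout_fraction=0.1)))
```

An expert-generated ruleset of this kind keeps every decision traceable to a structural property observed in the field, at the cost of thresholds that have to be set by signature analysis rather than learned automatically.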
Extending the research presented should incorporate more rigorous methods of water surface detection [63,67] and explore the complementary information from hyperspectral imaging, which was simultaneously acquired. Full-waveform ALS recording or sampling with a higher point density might supply the missing information for separating the currently less accurate categories. It can also be expected that more complex multivariate classification using the parameters as pseudo-bands would reveal important structural patterns and possibly lead to better identification of the current categories.

Figure 1. Surveyed ALS flight strips around Lake Balaton and Kis-Balaton. Inset shows location of Lake Balaton inside Hungary.

Figure 2. Typical ALS profiles of main classification categories. Vertical labels show ellipsoidal height in meters. Points included in the profile are within a strip of 15 m width and about 120 m length. Point brightness corresponds to ALS echo amplitude: bright points have higher amplitudes than dark points.

Figure 3. (a) Uncalibrated ALS echo amplitude of the area used for radiometric calibration. Range 0 (black)-255 (white). Polygons outlined in red are areas where reference spectra were collected. Note alternating bright and dark scan lines caused by differing gain levels of the scan lines. (b) Gain control values. Range 152 (black)-170 (white). Note abrupt change in gain control due to the low reflectance of water in the top (void) area of the image, and alternating high and low gain control values on alternating scan lines caused by the presence of a low reflectance surface (water). (c) Calibrated surface reflectance. Range 0 (black)-1 (white). Note that the linear feature visible in Figure 3(a), caused by a major change in gain control level, has been corrected, as have the alternating bright and dark scan lines.

Figure 4. (a) Planar view of the ALS point set in a die-back reed area. Open water patches within reed create dropout points. Image extent about 5 × 25 m. (b) Dropout interpolation generates a set of points within the void areas created by specular reflection from water. Points shown in red are created by the dropout modeling algorithm in the mid-point between the preceding and following echo on the scan line.
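The mid-point rule described in the caption of Figure 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used in the study: the function name `model_dropouts`, the `(pulse_index, x, y)` record layout and the use of a gap in pulse indices to detect missing echoes are all hypothetical.

```python
def model_dropouts(echoes):
    """Create one synthetic point per gap along a single scan line.

    echoes: list of (pulse_index, x, y) tuples sorted by pulse_index.
    A jump of more than one in pulse_index is assumed to mean that at
    least one emitted pulse returned no echo (a dropout).
    """
    synthetic = []
    for (i0, x0, y0), (i1, x1, y1) in zip(echoes, echoes[1:]):
        if i1 - i0 > 1:
            # Place the modeled point halfway between the echoes
            # that precede and follow the void.
            synthetic.append(((x0 + x1) / 2, (y0 + y1) / 2))
    return synthetic


# Hypothetical scan line with pulses 2 and 3 missing (illustration only).
line = [(0, 0.0, 0.0), (1, 1.0, 0.0), (4, 4.0, 0.0), (5, 5.0, 0.0)]
dropouts = model_dropouts(line)
print(dropouts)
```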

Figure 5. 3D vegetation structure parameters used for ALS vegetation mapping. Cross section view of ALS point cloud and grids interpolated for vegetation classification. Note different scales of surface roughness that correspond to input parameters for classification.

Figure 6. Example of signature analysis, showing calibrated ALS reflectances of monodominant Carex, Typha and Reed areas. Carex and Typha can apparently be separated well from each other based on reflectance, but not from Reed.

Figure 7. Example of vegetation map, showing identified open water, tree and artificial shore areas, and the location and zonation of a wetland. Dark blue line shows the shore; the lake is on the southern side of the line. Typical vegetation zones can be observed: Carex nearest to the shore, Typha in the interior of the stand, and reed on the outside with some die-back immediately adjacent to the open water.

Table 1. Parameters used by decision tree algorithm for ALS-based vegetation categorization.

Table 2. Confusion matrix of vegetation categories and accuracies.

Table 3. Classification accuracies of summed vegetation categories.