Canopy Gap Mapping from Airborne Laser Scanning: An Assessment of the Positional and Geometrical Accuracy

Canopy gaps are small-scale openings in forest canopies which offer suitable micro-climatic conditions for tree regeneration. Field mapping of gaps is complex and time-consuming. Several studies have used Canopy Height Models (CHM) derived from airborne laser scanning (ALS) to delineate gaps but limited accuracy assessment has been carried out, especially regarding the gap geometry. In this study, we investigate three mapping methods based on raster layers produced from ALS leaf-off and leaf-on datasets: thresholding, per-pixel and per-object supervised classifications with Random Forest. In addition to the CHM, other metrics related to the canopy porosity are tested. The gap detection is good, with a global accuracy up to 82% and consumer’s accuracy often exceeding 90%. The Geometric Accuracy (GAc) was analyzed with the gap area, main orientation, gap shape-complexity index and a quantitative assessment index of the matching with reference gaps polygons. The GAc assessment shows difficulties in identifying a method which properly delineates gaps. The performance of CHM-based thresholding was exceeded by that of other methods, especially thresholding of canopy porosity rasters and the per-pixel supervised classification. Beyond assessing the methods performance, we argue the critical need for future ALS-based gap studies to consider the geometric accuracy of results. Remote Sens. 2015, 7 11268


Introduction
Defined as small-scale openings in forest canopy, gaps are created by management activities (thinning) or natural disturbances, such as windthrow [1] or natural mortality.These openings alter micro-climatic conditions, increasing plant diversity and allowing regeneration by increasing light levels [2,3].Because of their high regeneration potential, gaps are important for foresters seeking to promote a nature-based silviculture, by emulating natural disturbances.Spatial characteristics (size, shape, distribution, orientation) of gaps in forests are of central importance in understanding regeneration, ecosystem dynamics, species diversification and distribution [4][5][6][7][8][9].Koukoulas and Blackburn [10] highlighted the importance of gaps with different spatial properties for seed establishment and determining the future forest structure.Lindenmayer et al. [11] proposed horizontal heterogeneity due to presence and distribution of gaps as a mechanism for conserving forest biodiversity by maintenance of stand structural complexity.Gap shape, orientation and size variation affect the gap dynamics, mainly through the variation in the light environment and moisture levels [11,12].In his study, Getzin et al. [13] used gap shape metrics derived from Unmanned Aerial Vehicles (UAV) images to assess the plant diversity in forests.
Reliable survey methods are an important prerequisite for the analysis of spatial patterns of gaps [5].However, mapping and characterising canopy gaps is a complex issue.Delineation of gaps in the field is difficult, time-consuming and there is a lack of consensus about canopy gap definition and methods for describing regeneration [3,8,14,15].Besides, field survey is not always reliable because of the difficulty of the ocular evaluation of the exact limits of the gap [16].
A commonly used definition of a gap was introduced by Brokaw [17] who defined it by the canopy drip-line, i.e., the vertical projection of the external borders of the tree crown with an height greater than 2 m.This definition is pragmatic in the field but can lead to an underestimation of the area affected by the gap disturbance.Thereafter, Runkle [7] introduced the concept of expanded gap which extends the gap to the base of the trunks.But, this concept is not easily adapted for remote sensing purposes as the position of trunks is hard to identify from above [18].The vegetation height is a common criterion to define the closing of a canopy gap.As Airborne Laser Scanning (ALS) has become a common technique to assess forest height, the use of Canopy Height Models (CHM) derived from ALS to delineate gaps has been explored in a number of studies [5,8,16,19].

Canopy Gap Mapping with ALS : A Background
Applying a fixed height threshold on an ALS CHM is the most common method of canopy gap delineation, based on the definition of Brokaw [17].In most cases, studies differ by the value of the threshold and a constraint of minimal area.Asner et al. [19] and Boyd et al. [20] quantified gap-size frequency distribution in Southern Peruvian Amazonia by mapping gaps with a height threshold of, respectively, 1 to 20 m and 2 m with a minimum area of 2 m 2 .Vepakomma et al. [16] and Vehmas et al. [8] chose a fixed 5 m height with a minimal gap size of 5 m 2 .Vepakomma et al. [16] focused on gap dynamics, with two co-registered CHM (1998 and2003), whilst Vehmas et al. [8] identified understory vegetation types.Kane et al. [5] studied patch dynamics, stand structure and spatial patterns in Pacific Northwest forests by analysing the spatial distribution of gaps, delineated as 3 m maximum height and larger than 9 m 2 .In 2004, Koukoulas and Blackburn [21] post-processed delineated gaps with GIS functions (shrinking) to isolate gaps connected by corridors and to refine gap shape and boundaries.
A second method of canopy gap mapping is the use of a relative height threshold.Gaps are then defined depending on the neighbourhood stand height.This approach requires typically, the creation of a canopy top raster.The gaps are identified by comparison of the original CHM with the canopy top raster.In mangrove forests, Zhang [22] used opening and closing operations on a CHM to produce a surface representing the top of the canopy.Gap grid cells were detected after a black top-hat transformation using a relative height threshold (0.65).According to the author, this approach gave better results than the fixed height method.However, no complete accuracy assessment was performed against field reference data.Instead, only a visual comparison of the gap boundaries was carried out, using a slope raster derived from the CHM and based on the assumption that the slopes along gap boundaries are steep.Such slope rasters were also previoulsy used in Koukoulas and Blackburn [21] for helping to identify the fixed height threshold to apply.In the study of Gaulton and Malthus [18], a canopy top raster was produced from a ALS CHM by applying a moving window and setting the maximum value to the grid cell center.Then, CHM grid cells with heights inferior to 66% of the corresponding height of the canopy top raster were classified as gap.Betts et al. [14] use a similar approach in Nothofagus spp.forest in New Zealand with a DSM produced from aerial photographs.In this case, gaps were delineated as areas showing a difference of 12 m between the two rasters.
To improve gap extraction, Gaulton and Malthus [18] mapped gaps directly from the point cloud by initially identifying the canopy area.Their processing method was initiated by the identification of local maxima corresponding to tree tops.Then, the returns with a relative height of less than 66% of canopy height were removed to retain only canopy returns, which were then clustered to a local maximum.The clustered canopy returns were merged and delineated to retrieve gap extent.This method avoids interpolation to a CHM and makes use of more complete height information from the return points.After comparison with the raster method, the authors demonstrated a small increase of accuracy compared to CHM methods, especially for low density ALS data, but the point cloud method is much more computationally expensive.
This latter approach favored a delineation of tree crowns and considers that gaps are found "by default".Intrinsically, gaps and crowns are complementary as their distribution defines the canopy structure.ALS sensors are able to capture canopy structure wall-to-wall and are used for the estimation of canopy structural variables as Vertical Canopy Cover (VCC), Angular Canopy Closure (ACC) or Leaf-Area index (LAI) [23][24][25].For example, the VCC is the total crown projection area (defined by the outermost perimeter of the crown on the horizontal plane) divided by the stand area.In this case, small gaps inside the crown are considered to be a part of the crown.VCC and ACC are related to the penetration of light through the canopy and are considered as indirect measurement of light regimes [26].

Aims of the Study
ALS makes detection and analysis of canopy gaps easier over large areas [27].But further work is needed to assess the potential of ALS for accurate gap mapping, especially regarding retrieval of gap area and shape, with the importance of these characteristics outlined previously.
In this context, we analyse the potential of ALS for canopy gap mapping in uneven-aged broadleaved stands with a focus on geometric accuracy of gap retrieval.For the purposes of this research, three mapping methods based on raster datasets produced from high density leaf-off and leaf-on ALS data are tested: Thresholding, supervised classification of individual grid cells ("per-pixel" classification) and per-object supervised classification.
In this study, priority is given to the raster-based approach, more commonly used and more suitable for use over extensive areas.Point-based delineation is not tested here due to the computational costs and as it produces only relatively small accuracy increases.Unlike in previous studies, we tested the use of ALS rasters other than CHM.We analyse if alternative metrics could improve gap delineation, especially in terms of retrieval of gap shape and borders.These rasters aim to capture the three-dimensional structure of points cloud and to describe the canopy porosity.For the thresholding method, several threshold values were tested.For the per-pixel and per-object approaches, we used supervised classification by Random Forest [28].The maps produced are compared in terms of detection quality (through a confusion matrix) and geometric quality (through a comparison with gaps mapped in the field).
The influence of the season is also considered by the use of both leaf-off and leaf-on ALS datasets.Such a comparison of the effectiveness of several mapping methods on both leaf-on and leaf-off ALS data sets, is novel for canopy gap delineation.Besides, no previous studies have attempted to validate gap geometry reliability.Furthermore, we investigate a new approach by which to classify forest stands according to their canopy opening degree prior to gap mapping, to provide a means to identify the optimal gap delineation method.

Study Site
The study area (≈ 18, 000 hectares, Figure 1) was located in the watershed of the Houille river near the village of Gedinne (Wallonia, Belgium).Forests, which cover approximately 60% of the area, are composed of uneven-aged broadleaved stands with oak (Quercus petraea (Matt.)Liebl.and Quercus robur L.) and European beech (Fagus sylvatica L.) and even-aged coniferous stands dominated by Spruce (Picea abies (L.) H. Karst.) and Douglas-fir stands (Pseudotsuga menziesii (Mirb.)Franco).Four compartments (management units of the Forest Service) were selected by a field visit and by photo-interpretation according to two main criteria: predominantly broadleaved stands and exhibiting varying degrees of canopy closure (Figure 1).Forty-two stands compose the four compartments of the study area.The limits of the stands were available from the Forest Service but the quality of the borders were checked by photo-interpretation with ortho-images and corrected with ArcGIS 10.1.Tree regeneration is present in the selected stands with a variable cover (ranging from a few seedling stems to large seedling cover).The regenerating species are oak, European beech and birch (Betula pendula Roth).The dominant herbaceous species are Pteridium aquilinum (L.) Kuhn, Rubus fruticosus L., Rubus idaeus L., Deschampsia flexuosa (L.) Trin, Luzula luzuloides (Lam.)Dandy & Willm and Vaccinium myrtillus L.

Canopy Gap Definition
For the purposes of this study, a clear and simple definition was necessary to identify gaps in the field.A definition similar to Brokaw [17] was used.Throughout the study, we defined gaps as openings in the canopy with a minimum area of 50 m 2 , a minimum width of 2 m and a maximum vegetation height of 3 m.The vegetation level value of 3 m is considered to indicate well-established regeneration [29] and is a critical height for the survival of regeneration, especially against ungulate predation.Most previous studies showed that ALS is able to detect very small size gaps (around 5 to 10 m 2 ).We consider gaps as priority areas for promoting natural regeneration, especially for oak which has survival difficulties below beech [30].Bruciamacchie et al. [31] concluded that the minimum opening area for oak regeneration is 70 m 2 and the optimal is between 100 to 120 m 2 .Moreover, small gaps (<50 m 2 ) close very fast due to the crown growth of surrounding trees [32].For this reason a minimum gap size of 50 m 2 is selected.

Field Data
Two field campaigns were conducted in August 2012 (in leaf-on conditions).First, a systematic grid (50 m × 50 m) was created covering each of the four study sites.Each point was visually interpreted in the field as "gap" or "not gap" areas (with a distinction of forest tracks and new gaps made by recent exploitation).A total of 1140 points were visited (295 of which fell in gaps).These data were used for the assessment of the gap detection quality.The corrected position of the theorical center of each 50 m × 50 m cell was established using a dGPS SXBlue II during the field visit.
Secondly, 39 canopy gaps (55 m 2 to 2400 m 2 ) were mapped in the field with a dGPS, electronic compass and laser rangefinder, with post-processing of the GPS data to obtain a planimetric error of, on average, 20 cm.The gap boundaries were delimited by the projection of the external borders of the crowns.Several points were recorded along the gap where the gap boundary was judged to significantly change its orientation [18].These reference gaps were used to assess the geometric quality of the delineated gaps.

ALS Data
The ALS datasets were captured in March (leaf-off) and July (leaf-on) 2011 by a Riegl LMS-Q680 (300 KHz), with a maximum scan angle of 15 • .First and last returns were recorded and classified by the data provider as water, building, high vegetation (returns with a height >1.5 m), ground and unclassified, according to the LAS format standard.Intensity information was also available.The point clouds were organised by 500 m × 500 m tiles.The average number of returns per 1 m 2 grid cell is 40 (average range between 1 and 238) and the average number of first returns is 21 per m 2 (average range between 1 and 190).The overall planimetric accuracy of the ALS dataset is 25 cm.

Pre-Processing of ALS Data
ALS data pre-processing was carried out using Las2Pix, a software developed by the Unit of Forest and Nature Management (University of Liège-Gembloux Agro-Bio Tech).This software is based on LASTools [33] and allows the production of raster metrics from the pre-classified ALS point cloud according to a number of different statistical functions and constraints and at a given spatial resolution.Each grid cell is assigned a value calculated from points falling inside the cell (e.g., maximum height for single returns only).Raster metrics were produced for each tile and thereafter, these were merged into one raster per compartment, at 50 cm resolution.This spatial resolution was chosen as a good compromise between the point density and the precision of the gap mapping (considering the minimal gap size of 50 m 2 ).
The CHM was generated as the height of the highest return in each grid cell.The Canopy Porosity Index (CPI) is the percentage of points below 3 m-high.This metric is inversely proportional to the amount of leaves, branches and stems, and considered as a measure of the degree of opening of the canopy (Figure 2).This is the complementary of the Canopy Closure as defined in Kane et al. [5].As with the CPI, the Ground Point Ratio (GPR) is proportional to the porosity of the canopy and is calculated as the percentage of points classified as ground in each grid cell.The CPI and GPR are calculated on the basis of all returns.The Height-Scaled Crown Openness Index (HSCOI) was developed by Lee and Lucas [34] and is based on the voxel concept, calculated within a moving window.This index provides a quantitative measure of the relative penetration of ALS returns into the canopy.Three other metrics were produced describing the canopy height: the standard deviation of height (Std-H), the coefficient of variation of height (CV-H) and the slope of the CHM (Slope-H).Zhang [22] used this latter metric to assess the quality of automatically delineated gaps boundaries, as many gap boundaries are well captured by rings of pixels with large slope values (Figure 2).Slope-H is tested here as a potential approach to refine gap boundaries.Finally, an intensity metric was generated as the mean intensity of single returns, to characterize the spectral response of the high vegetation (Mean-I-S).

Gap Mapping Methods
The three methods implemented in this study for gap detection (thresholding, per-pixel supervised classification and per-object supervised classification) are described in the following sections (Figure 3).

Thresholding
LiDAR canopy gap mapping is basically to discriminate canopy areas and non-canopy areas (corresponding to low vegetation or ground areas).In the simple thresholding approach, a fixed threshold value was applied to a single raster layer, either CHM, CPI or GPR to delineate gap areas.Three different metrics to identify non-canopy pixels were selected : using a fixed height threshold on a CHM, using a proportion of points classified as ground (GPR) or using a proportion of points below a fixed height threshold, considering presence of low vegetation and not only ground points (CPI).Several threshold values were tested for these three rasters and the result of the simple thresholding approach was a binary raster output (canopy or gap): • CHM: gaps are identified as grid cells with a height below 3 m or 5 m; • CPI and GPR: gaps are identified as grid cells with a CPI/GPR above a threshold.A specific step to determine the optimum threshold values for CPI and GPR (respectively for leaf-off and leaf-on) was implemented to avoid a systematic test of several values for each datasets.This threshold selection is presented in Section 3.2.1.
Secondly, multiple thresholding was implemented by adding three thresholded binary raster layers (CHM, CPI and Slope-H).With this particular combination of metrics, we aimed to improve gap delineation from the most commonly used CHM thresholding approach by the complementary use of other LiDAR information, especially in refining the boundaries using slope information and reducing the impact of small areas of high understorey or regeneration, using the CPI.The slope and CPI thresholds were set after a trial-error visual assessment and two combinations were tested: The binary rasters Slope-H-75% and Slope-H-60% participate in identifying gaps areas by detecting the boundaries of gaps as grid cells with a slope above 75% and 60% respectively (slope of the CHM is usually steep along the gap border).The results of the two combinations had grid cell values between 0 and 3 and thus were reclassified into binary rasters (value 0 as canopy and values 1, 2 or 3 as gap).

Per-Pixel Supervised Classification
For the supervised classification process, 100,000 training points were selected (equally distributed in gap or non-gap classes) for leaf-off and leaf-on datasets.These reference points were chosen based on field data and photo-interpretation, and were spread across the four study sites.Corresponding grid cell values were extracted for four rasters metrics (CHM, CPI, Std-H, I-S) for each reference point.We used the package randomForest [35] in R software [36] to construct a Random Forest (rF).The rF algorithm was applied with ntree = 1000 and mtry = 4. Variable importance was calculated to analyse which metrics were the most discriminatory.The out-of-bag error (OOB error) was derived as a robust way to assess the global accuracy of the final classification by the rF algorithm.After this training step, the constructed Rf was applied to the full extent of the four sites.
Random Forests [28] are a very efficient and popular algorithm for classification [37].The OOB error is used to estimate the prediction error and to assess the importance of each variable (VarImp) which allows sorting of predictors.There are two objectives regarding variable selection by the VarImp measure: to find variables related to the response for interpretation purposes and to find a small number of variables that are sufficient for a good prediction.Calculation of the OOB is based on a bootstrap sample of the data [35,37].

Per-Object Supervised Classification
A segmentation was conducted in eCognition [38,39], based on two raster metrics (CHM and CPI) for both leaf-off and leaf-on datasets.The Multiresolution Segmentation Algorithm [40] was used to extract small homogeneous areas, typically segments of tree crowns or gaps.The scale parameter was set to 10 for a shape/color ratio of 0.5 and a compactness/smoothness ratio of 0.5.These objects were exported from eCognition in polygon shapefile format, with the following object attributes: mean of the CHM, mean CPI, mean I-S, ratio of mean object I-S to total scene average I-S (Ratio to Scene Intensity, RtS Int.), calculated as the mean intensity of the object divided by the mean intensity of the entire scene), standard deviation of the CHM, standard deviation of the CPI and standard deviation of the I-S.The four compartment shapefiles were merged in ArcGIS.We selected training objects from among the eCognition segments for the four study sites: 530 gap objects and 530 canopy objects for the leaf-off and leaf-on data sets.We applied the rF algorithm with ntree = 1000 and mtry = 4 and calculated the variable importance to analyse which metrics were most discriminatory.

Morphological Filtering
A post-processing stage was then implemented (with ArcGIS tools) to obtain the best possible match between the ALS mapping approaches and the definition of a gap stated above.Thin corridors and small spaces between trees (width inferior to 2 m) or holes within crowns, were eliminated by using the Shrink tool (an erode function) which reduces the gap areas by 1 m around the classified gap area by replacing gap pixels with the most frequently occurring value in the local neighborhood.This step is followed by the application of the Expand tool (a dilate function) which enlarges the classified raster gaps by 1 m (to return remaining gap areas to the initial size before shrinking).We reduced the pixelization and eliminated remaining trees within gaps by applying the Focal Statistics tool with a circular neighborhood of 1.5 m radius and retaining the most frequently occurring value (majority filter).Following this initial post-processing, the binary rasters were converted into polygon shapefiles and forest roads were removed manually with the Erase tool.Finally, gap polygons with an area <50 m 2 were removed (Select).These filters were applied consistently for all gap mapping methods.

Gap Detection
For the CPI and GPR thresholding, a preliminary step is applied to select an optimum threshold for both leaf-off and leaf-on datasets.For this purpose, Receiver Operating Characteristic (ROC) curves are produced, by means of the ROCR package [41] in R. ROC curves are used to visualise classifier performance, to compare and evaluate prediction models, by changing the threshold value used for the detection to computing, for each case, the True Positive Rate (TPR) and the False Positive Rate (FPR).As a trade-off is present between the TPR and the FPR, ROC curves help identify the optimum threshold value.To apply this technique, we used the same training data as for the training step of Random Forests (3.1.2).
In order to compare the gap detection efficiency of the thresholding and supervised classification methods, a quality assessment was performed using a confusion matrix (or error matrix) to derive global/overall, producer and consumer/user accuracies for the gap class [42].CA measures errors of commission specific to the user's map.The producer's accuracy (PA) measures errors of omission.As reference data, we used the 295 gap points of the systematic 50 m × 50 m grid and 295 randomly selected points from among the 845 canopy (non-gap) points.

Influence of Stand Type on Gap Detection
As ALS metrics can be used to describe the canopy cover, a range of metrics were derived to characterize the opening degree of the forest stands and identify "forest types" in our study area.Each stand was described by nine variables derived from five ALS metrics: mean and standard deviation of the HSCOI, mean and standard deviation of the CPI, mean and standard deviation of the CHM, mean of CV-H, the percentage of the stand area with a height of < 3 m (P-3m, based on the CHM) and the percentage of stand area with CPI ≥ 75% (P-CPI75).To simplify comparison, this analysis was conducted with leaf-off data only.Before the analysis, a negative buffer (5 m) was applied to limit the border effect.The hierarchical clustering function hclust in the R software [36] was used to group the stands, using the euclidean distance and the Ward method.
Subsequently, the quality of the gap mapping was examined in relation to the forest type, to analyze if it is relevant to choose a gap mapping method based on stand openness characteristics.The confusion matrices of all mapping methods for the leaf-off dataset were analysed and compared for each cluster using PA, CA and the Kappa Index (instead of GA).Kappa Index (KI) is a standard statistical measure of the difference between observed agreement and the chance agreement between reference and classified data [42].KI is of relevance when the number of reference points for the different classes is not identical, as is the case within the forest types.

Gaps Geometry
The 39 field mapped gaps were used to evaluate the geometric quality of the ALS delineated gaps.ALS gap polygons were selected which intersected the reference gaps.To assess the geometric accuracy, the reference gaps were compared to the ALS delineated gaps using four geometric indicators.The area, the shape complexity and the Index D describe the shape and size of the gap, which influences its functioning.Finally, the main direction of the gap is the global orientation and is considered as a descriptor of the light environment.The following measures were used: GSCI is a measure of the relative complexity of shape; it compares the shape of the gap to a circle.It is computed as the ratio of a gap's perimeter to the perimeter of a circular gap of the same area.An increasing value of GSCI is an increasing shape complexity [13,21].• Area of the gap ( m 2 ) • Main direction (degrees): This is the azimuth of the longest line within the boundaries of the polygon without crossing edges.This line is created by the geom.polygonfetchcommand of Geospatial Modelling Environment.The azimuth value ranges between 0 and 180 degrees.• Index D: derived from Oversegmentation and Undersegmentation (Equations ( 2) to ( 4)).
This index is used in several studies to assess the accuracy of object-based image segmentation [43,44], to determine best segmentation parameters.Index D is a distance which varies between 0-1 and is a quantitative assessment of the goodness of polygon matching.Index D has to be minimized and a good balance between oversegmentation and undersegmentation has to be found to optimize the results.In this analysis, polygons for which the ratio of the intersected area between ALS and field gaps was a minimum 10% were retained (compared to 50% in Möller et al. [43] and Clinton et al. [44]).

Results
In Section 4.1, the gap detection results are presented, while Section 4.2 focuses on the influence of the stand type on gap mapping.Finally, Section 4.3 is dedicated to the assessment of the geometric accuracy.

Optimum Threshold Selection for CPI and GPR
To select an optimum threshold for CPI and GPR data, ROC curves were used with a training dataset.Two types of graphics are analysed : changes in the TPR and FPR according to the threshold values (Figure 4) and changes in the global accuracy according the threshold value (Figure 5).Based on the ROC curves, the following values for CPI and GPR were selected : CPI-60 for the leaf-off dataset, CPI-50 for the leaf-on dataset, GPR-60 for the leaf-off dataset.These thresholds were the only CPI and GPR simple thresholding methods retained in the subsequent analysis.The optimum CPI and GPR thresholds in leaf-off conditions are the same, but GPR performance is systematically lower.CPI and GPR global accuracy curves (Figure 5) are very similar for the leaf-off datasets and for CPI leaf-on for thresholds >50%.Leaf-off accuracies are superior in general to leaf-on, which can be explained by the difference in vegetation density.Decreasing the GPR threshold does not really improve the quality of gap detection.This tendency seems to highlight the importance of the low vegetation influence on the results.Based on the low gap detection accuracy, GPR thresholding is not considered in further analysis for leaf-on data sets.

Summary of the Confusion Matrices
The results of the gap detection accuracy assessment for the nine selected methods, for the two seasons, are summarized in Table 1.These results allow a number of comparisons to be made: leaf-on vs. leaf-off data, thresholding vs. supervised classification, per-object vs. per-pixel, CHM vs. other ALS metrics.Based on global accuracy, the three best methods are: per-pixel classification in leaf-on and leaf-off conditions, CPI-60 for leaf-off data and combination1 multiple thresholding in leaf-on conditions.CHM-5m thresholding also produces high global accuracy in leaf-off conditions, but has a significantly lower producer's accuracy, indicating larger omission errors.The specific results of supervised classifications are presented in Table 2 and Figure 6     Global accuracies are, in general, greater for leaf-off conditions.Leaf-off producer's accuracies lie between 52% and 94% against 19% and 86% for leaf-on conditions.Leaf-off producer's accuracies are around 10% greater than leaf-on for all methods (Table 1).In consequence, omission errors are lower in winter conditions.The difference between leaf-off and leaf-on data is lower for the consumer's accuracy.The commission errors are small (7% to 22%, excluding Combination2 multiple thresholding) meaning that areas identified as gaps are reliable, but are generally underestimated from leaf-on data.
For the per-pixel and per-object supervised classifications (Table 2), the accuracy results (OOB, global, producer and consumer accuracies) are similar for leaf-off and leaf-on data.The VarImp parameter (Figure 6) highlights Mean-I-S and CPI as important for the leaf-on dataset when using per-pixel classification, whereas the CHM is of relatively greater importance for leaf-off data.This contrast is lower for the leaf-off dataset.No such differences in the importance of intensity are observed for per-object classification between the two seasons.Gap detection by CHM thresholding shows an average producer's accuracy but consumer's accuracy is very high, more so than for per-pixel or per-object classification.On average, the discrepancy between consumer's and producer's accuracies is greater for thresholding-based methods, but global accuracy remains relatively similar and good for all methods, except for GPR leaf-on thresholding.Based on global accuracy, CPI-60 is the best performing method, but high and similar accuracy is obtained from per-pixel classification.

Per-Object vs. Per-Pixel
The OOB accuracies for per-pixel and per-object classifications are given in Table 2; the values are similar and very high.The importance of variables involved in the supervised classification are shown in Figure 6.Owing to the definition of objects (group of several grid cells), more variables can be utilised in the per-object classification process, as the mean and the standard deviation of the grid cell values can be calculated.The most important variable for leaf-off and leaf-on per-object classification is the mean of the CHM, followed by the mean CPI.The variable importance analysis for per-pixel classification identifies a different order of the variable importance.The Mean-I-S is the most discriminating variable for the per-pixel classification in both leaf-off and leaf-on data sets.4.1.6.CHM Thresholding vs. other Metrics CHM thresholding did not produce the best accuracies in this comparison, particularly in terms of the omission error (CHM-3m had a low producer's accuracy).Use of a higher threshold for the CHM (5 m) improves the gap detection, with an increase of 6% to 10% for producer's accuracy.The GPR has poor performance compared to CPI and CHM metrics due to a high omission rate.CPI-60 in leaf-off conditions provides the highest global accuracy and the best balance between omission and commission errors.
Table 3 presents the improvement in gap detection of the three best performing alternative methods over CHM thresholding (per-pixel classification of leaf-on data, CPI-60 thresholding of leaf-off data and Combination1 multiple thresholding of leaf-on data).The leaf-off results are compared with leaf-off and leaf-on with leaf-on for all methods.The consumer's accuracies vary less than producer's accuracies between the methods and are best for the CHM thresholding.Alternatives to CHM thresholding provide higher global accuracy, but these are relatively small improvements (1%-4% increase in global accuracy).The most important improvement is the decrease in omission error.

Influence of Stand Type on Gap Detection
The hierarchical clustering of the 42 stands produced a dendrogram from which three groups were extracted.Analysis of these clusters shows a separation according to the opening degree of the canopy.The following three stand types are identified: opened canopy with isolated trees only (cluster 1), closed canopy (cluster 2) and heterogeneous canopy (cluster 3).Two main variables amongst the nine are particularly useful to differentiate the three clusters: the P-3m and P-CPI75 (Figure 7).The two variables are highly correlated and thus only one is sufficient to discriminate the three groups.The choice may depend on the forest type, the recruitment characteristics or the ALS data.The P-3m is the easiest indicator to use as only a CHM is needed.
Figure 8 shows the Kappa index, consumer and producer accuracies for each method for each of the three stand types.The Kappa index shows the greatest differences between the methods.The discrepancies in accuracy between the methods varies according to the stand type: commission errors are more stable between methods, but are higher for open canopy (consumer accuracies were greater than 80% in general for heterogeneous and closed canopies), whilst omission errors are more variable between methods for closed canopy.Heterogeneous canopies have the highest Kappa index for most methods.Regarding Kappa index, the best methods for each stand type are : combination1 multiple thresholding for closed and open canopies and CPI-60 thresholding for heterogeneous canopies.The analysis of the Kappa index (in this case, for leaf-off data only) shows that the best global methods may be different from the best methods for specific stand types.The three best leaf-off methods when considering all stands (Table 1) are per-pixel classification, CPI-60 CHM-5m thresholding.Amongst these three methods, per-pixel classification performs most consistently across all stand types, but CPI thresholding is best in heterogeneous and closed stands.CHM thresholding (5 m threshold) gives lower accuracies, with notably higher omission errors, especially in closed canopy.In such closed stand types, all these methods are significantly outperformed by Combination1 multiple thresholding, which performed less well for leaf-off data in the global case.

Gaps Geometry Accuracy
This section will provide a synthesis of the accuracy of the different methods in retrieving characteristics of gap geometry.Figure 9 illustrates four typical gaps with example delineations from the three methods giving the best detection results.A particularly poor result is observed in Figure 9g,k, where the ALS delineation resulted in large and very branched gaps, which encompassed several reference gaps (becoming connected by corridors).Such results appear in stands where openings in the canopy are small or medium and spread throughout the stand, forming a network of connected openings.The gaps delineated in this case are more complex in shape than isolated ones and may be less well fitted to the definition of a gap used in this study and to accurate field delineation.To overcome this issue, which may bias comparison of geometric accuracy, highly branched and interconnected gaps were therefore eliminated from the subsequent analysis of geometric accuracy by applying a selection based on the GSCI (GSCI< 5).
In the following sections, ALS gaps are compared to reference gaps based on the error, computed as the difference between the reference value and the ALS value; except for the case of Index D, which is intrinsically a comparison with reference data (Figure 10).

GSCI
Regarding GSCI, the weakest results are obtained for multiple thresholding : Combination 2 for leaf-off data produces no gaps with GSCI <5 and Combination1 has the largest mean error and high extreme values (for both leaf-off and leaf-on).The two CHM thresholding methods are quite similar with a tendency to overestimate shape-complexity in comparison to field data, as for all the methods.The best performance is from per-pixel classification of leaf-off data.

Gap Area
CPI thresholding and per-pixel classification perform well for both seasons.Multiple and GPR thresholding show the worst results, especially for the leaf-on dataset.The average errors are acceptable but, despite the GSCI-based filtering, there are still extreme values, mainly representing large overestimates of gap area (except for CHM thresholding in leaf-on conditions).

Main Direction
For both leaf-off and leaf-on datasets, and all methods, the average error in azimuth is close to 0, but there is large dispersion or errors with extreme error values (up to 150%).The best results are obtained from CPI-60 thresholding of leaf-off data and per-pixel classification of leaf-on data.The standard deviation of errors for the azimuth is lower for the leaf-off dataset, suggesting more consistent prediction of gap orientation.

Index D
A good match between reference and ALS delineated polygons corresponds to a low Index D and indicates a good balance between oversegmentation and undersegmentation (Figure 11).Index D is, in general, lower for leaf-off datasets, and CPI-60 thresholding from leaf-off data has the smallest Index D value with a good oversegmentation/undersegmentation ratio.In leaf-on conditions, per-pixel classification is the most effective method.Undersegmentation is higher in leaf-off conditions, which confirms the tendency to delineate excessively large gaps from leaf-off data.The multiple thresholding methods generally produced the poorest results for this index, although Combination1 performed better in leaf-on conditions.The CHM-3m and 5m thresholding show similar results for leaf-off and leaf-on datasets.).This situation is often associated with a small oversegmentation value because the reference area is often entirely covered by the ALS-derived gap.Small values of undersegmentation occur when a reference gap is covered by several small ALS gaps (c) or a single ALS gap but with a good coverage and a satisfactory global shape (a).(d) is an intermediate case.

Discussion
Three methods are highlighted as producing the best gap detection results: per-pixel supervised classification of leaf-on data, Combination1 multiple thresholding of leaf-on data and CPI-60% simple thresholding of leaf-off data.The global accuracies achieved using these best methods (81%-82%) compare favourably with those of past studies (e.g., 73.1% of field and ALS gap length matched along transects in Vepakomma et al. [16] or 77%-88% global accuracy in Gaulton and Malthus [18] using a computationally expensive point cloud based approach).The choice between methods implies trade-offs dependent on the study requirements, i.e., more reliable gap detection but with a risk of missing some gaps or more detected gaps with a risk of false detection.For example, multiple thresholding methods provide the highest producer's accuracy (greater than 80 % for leaf-off data, 90 % for leaf-on), but at the cost of higher commission errors.These methods also have variable performances regarding geometric accuracy.Independently of the gap detection accuracy, CPI-60 thresholding of leaf-off data and per-pixel classification of leaf-on data give acceptable results for geometric accuracy by most measures, unlike Combination1 multiple thresholding of leaf-on data which produces a poor final map in terms of GSCI and Index D.
Optimum thresholds for CPI and GPR were identified through ROC curves, but it is acknowledged that the thresholds used are derived empirically.Further research to examine their physical basis would be valuable.However, the identified thresholds appear reasonable, as within a correctly identified gap, substantial numbers of returns > 3 m in height (e.g., a CPI of up to 40%) may result from very sparse but high understorey or regeneration, small within-gap tree crowns or mixed canopy and gap cover within pixels along the gap boundary.
In general, better results were obtained from most methods in leaf-off conditions (with the exception of multiple thresholding and per-object classification) regarding gap detection and geometry.The high point densities of the ALS datasets used in this study allow good characterisation of tree crowns, even in leaf-off conditions, and the absence of leaves decreases occlusions and improves representation of the ground.In deciduous forest, leaf-off data may therefore be generally preferable to leaf-on data unless point density is low, but the transferability of this finding to other forest types and datasets needs further examination.GPR thresholding performed worse for leaf-on data than CPI thresholding, whilst CPI and GPR leaf-off results are more similar.These findings likely indicate an effect of the low understorey vegetation presence in summer on gap detection (few returns penetrating to the ground in leaf-on conditions even within gaps) with GPR-based methods.The effect of the season is also highlighted in the per-pixel classification.The variable importance shows a relatively greater importance of CHM for leaf-off data versus Mean-I-S and CPI for leaf-on data.This is probably due to the higher contrast in reflectance between elements in leaf-on conditions (e.g., ground vs. leaves).
Despite the relative technical simplicity of the method, thresholding is not clearly surpassed by supervised classification.Given the complexity of implementation and the similar performance compared to other simpler methods, classification methods may not always be the most efficient choice.Image segmentation used eCognition and was intentionally simple but the use of such software requires a learning process by the user and remains a black box approach.The simpler per-pixel approach also outperforms the per-object classifier based on global accuracies from the confusion matrix.The preparation of the training samples is time-consuming for both supervised classification methods.
The results of this study also suggest that CHM thresholding is outperformed by other methods (see Table 3).However, choice of threshsold is also important if using a CHM.An increase in the height threshold (3 m to 5 m) improves the detection.This is probably due to better representation of the gap border, with closer correspondence to the canopy drip line, but the results of the 3m threshold could also be influenced by high understorey vegetation.The CHM-based method also performs less consistently between stand types (high omission errors in closed canopies), although this may be improved by the use of relative rather than fixed height thresholds, as proposed by, for example, Zhang [22] and Gaulton and Malthus [18].Methods based on CPI thresholding from leaf-off data and per-pixel supervised classification based on ALS-derived raster layers are, according to our results, the most accurate and reliable, (i.e., using canopy porosity index, rather than a CHM, decreases the under-estimation of canopy openings).However, it should be noted that CHMs have other very useful applications in forestry and are valuable ALS products.The creation of a CHM has other advantages for a forest manager, such as allowing height estimation and a CHM is therefore more multi-purpose.The CHM is also likely to be less sensitive to ALS point density than the CPI.Moreover, CHMs can also be produced from a photogrammetric DSM by subtracting an existing DTM.Photogrammetry can be, in some cases, more cost-effective than ALS.The relatively small accuracy gains over the use of CHM thresholding may therefore not always justify the use of other metrics, such as CPI, for gap mapping.
The results of the geometric accuracy assessment indicate poor agreement with field data in many cases, especially in terms of gap shape.In this case, a complementary explanation could lie not only in the gap mapping method, but also in the field survey, during which challenges arise when mapping complex gaps or interconnected gap networks.It is likely that neither the field mapping or the ALS-derived gaps fully represent the true gap geometry.Whilst gaps have been field surveyed with a good degree of accuracy, subtle details of gap boundaries may be better represented by ALS, resulting in more complex shapes.To some extent this is to be expected, due to the high resolution of the ALS data relative to field mapping and some differences between reference and ALS gaps may therefore reflect this issue.Even though many errors were larger than would be accounted for by this factor, future studies are needed to investigate the possibility of using more advanced approaches, such as terrestrial laser scanning, to improve validation.The increasing use of unmanned aerial vehicles also provides a potential alternative method for mapping gaps [45].The relatively easy acquisition of very high-resolution images and the possibility for programming repeated acquisitions [46] should foster the development of gap dynamic studies.
Particular challenges were encountered in obtaining comparable ALS and field delineations, in terms of gap geometry, for the heterogeneous stand type.However, a more detailed analysis reveals that two subgroups composed this stand type, with the same global gap coverage but differing spatial distribution: one with larger open areas, more isolated and well defined and a second where openings are distributed more diffusely within the stands.Extensive networks of interconnected gaps are a source of error, due to challenges in both gap definition and the field data collection.The chosen definition of gaps in this study fits better with the concept of isolated gaps.When considering issues such as light availability for regeneration, alternative conceptual approaches to examining gap geometry, size distribution and spatial arrangement in heterogeneous or open stands are likely to be required, moving away from the concept of isolated gaps (which is appropriate in closed canopies) to an 'open matrix model' [47,48].ALS mapping of the distribution of gap space has potential to facilitate this over larger spatial extents.

Conclusions
Gap mapping is a topic of major importance for forest managers and ecologists.A reliable mapping of canopy gaps is the basis for numerous research and management applications (for example gap dynamics, gap distribution at the scale of the landscape, habitat modelling and silvicultural applications).This paper has investigated canopy gap mapping through four types of methods (simple thresholding, multiple thresholding, per-pixel supervised classification and per-object supervised classification) applied to leaf-off and leaf-on datasets (with an average density of 40 returns per m 2 ), taking into account the mapping accuracy in terms of both gap detection accuracy (590 validation points) and geometric quality (analyzed for 39 reference gaps mapped in the field).
Our findings demonstrate that mapping of canopy gaps with ALS is not trivial, especially regarding errors in gap areas (maximum error up to 2000 m 2 ) and geometry (e.g., mean Index D = 0.43 for Combination1).This study indicates that ALS can be used to map canopy gaps in uneven-aged broadleaved forests with gap detection accuracies of up to 82%.New methods, based on thresholding of CPI and per-pixel supervised classification of ALS-derived raster layers, show particular promise for obtaining accurate gap detection (82% for CPI-60 leaf-off and 81% for Per-pixel leaf-on) and geometric representation (GSCI mean error 16 • and 13 • for CPI-60 leaf-off and Per-pixel leaf-on respectively and main direction mean error -0.7 and -0.55).But taken together, the results suggest the importance of and difficulty in identifying an optimum mapping method for all circumstances.The choice of method is generally likely to be a trade-off between producer's and consumer's accuracy or between detection and geometric accuracy, e.g., GPR-60 Leaf-on global and consumer's accuracy equal to 76% and 89% but producer's accuracy equal to 52%.Beyond the performances of the specific methods presented here, we argue the critical need for future ALS-based gap studies to consider the geometric accuracy of their results, even if ALS remains the best option for gap mapping over large extents.
Further challenges arise regarding the selection of an appropriate definition of canopy gaps, compatible with remote sensing, and in achieving a meaningful comparison of ALS and field delineations.Despite the challenges, understanding the differences in gap geometries and gap size distributions obtained from traditional field survey methods and from ALS delineation is vital, if conclusions regarding gap dynamics and spatial patterns from studies utilising these different approaches (in different locations or at different time periods) are to be compared and understood.
The most appropriate approach to mapping of canopy gaps will depend on the forest type and the application.This implies the adaptation of methods, for example in the choice of threshold, of morphological filters and in the methodology for the field survey.Moreover, as the effect of the understory vegetation has important implications, future studies should examine the assessment and characterisation of low vegetation and regeneration with ALS data.The effects of ALS acquisition parameters such as scanning angle, point density and derived raster resolution were not assessed in this paper, but also need to be further explored [18,49].

Figure 1 .
Figure 1.Boundary of the watershed of the Houille river, near the Belgium/France border.The four sites (compartments numbered one to four) are scattered in the study area and their areas are respectively 113 hectares, 39 hectares, 49 hectares and 85 hectares.Several types of stands with different degrees of canopy closure are illustrated by a subset of an infrared false color aerial image acquired in 2009 (50 cm resolution).

Figure 2 .
Figure 2. Metrics are produced from the raw airborne laser scanning (ALS) data.The yellow polygon corresponds to a field mapped gap over an infrared aerial image (a); The Canopy Height Model (CHM, (b)), the slope of the CHM (Slope-H, (c)) and the Canopy Porosity Index (CPI, (d)) are illustrated.

Figure 3 .
Figure 3. Synthesis of the mapping methods comparison.Three methods (thresholding, per-pixel and per-object supervised classifications) are compared.Per-pixel and per-object classifications are implemented with the Random Forest algorithm in R. Quality assessment is by a confusion matrix (detection quality) and by comparison with field-mapped reference gaps (geometric accuracy).
and discussed further in Sections 4.1.4and 4.1.5.

Figure 4 .
Figure 4. Receiver Operating Characteristic (ROC) curves for Canopy Porosity Index (CPI) and Ground Point Ratio (GPR) performance, for thresholds between 0% and 100%.The dots correspond to threshold values between 0% to 100% by 10%-step.Below a certain value, a plateau of True Positive Rate (TPR) is observed with an increasing of False Positive Rate (FPR).

Figure 5 .
Figure 5.Comparison of the global accuracy obtained from CPI and GPR thresholding according to the threshold applied (0% to 100%)

Figure 6 .
Figure 6.The mean decrease accuracy is a way to measure the variable importance for (a) per-object and (b) per-pixel classifications.This indicator allows the sorting of the variables according to their discriminatory abilities.

Figure 7 .
Figure 7. Three clusters are identified from the hierarchical clustering, corresponding to three stand types: open canopy with isolated trees only (cluster 1), closed canopy (cluster 2) and heterogeneous canopy (cluster 3).These types are well defined by two indicators: the percent of stand area with height < 3 m and the percent stand area with CPI ≥ 75%.

Figure 8 .
Figure 8. Kappa Index, producer's and consumer's accuracies of the three stand types for each gap detection method using the leaf-off dataset.

Figure 9 .
Figure 9. Four examples of canopy gap mapping results (yellow polygons) are compared with the the reference gaps (black-yellow dashed polygons) overlain on a near-infrared composite (50 cm, year 2009): (a-d) are per-pixel classification of leaf-on data;(e-h) are CPI-60 thresholding of leaf-off data and (i-l) are Combination1 multiple thresholding of leaf-on data.These four cases are typical of results obtained in this study: a good correspondence between the field and ALS gap ((a) and (e)); a good coverage of the reference gap but in several parts ((b), (f), (j)), a weaker result resulting in a large and very branched airborne laser scanning (ALS) gap ((c), (i) and especially (g) and (k)) corresponding to several separate reference gaps, which is typical of sparse or heterogeneous canopy cover; and small gaps missed or partially missed in more closed canopy ((d), (h), (l)).

Figure 10 .
Figure 10.The geometric accuracy of the selected mapping methods is assessed by four indicators : gap-shape complexity index (a,b), area (c,d), main direction (e,f) and Index D (g,h).The error is computed as the difference between the reference value and the ALS-value.The results are summarized in boxplots for the two seasons (left for leaf-off and right for leaf-on).

Figure 11 .
Figure 11.Examples of oversegmentation and undersegmentation values for a sample of gaps.(a) large value for undersegmentation corresponds to an ALS gap overestimating the reference gap (e.g., very large and branched ALS gap, (b)).This situation is often associated with a small oversegmentation value because the reference area is often entirely covered by the ALS-derived gap.Small values of undersegmentation occur when a reference gap is covered by several small ALS gaps (c) or a single ALS gap but with a good coverage and a satisfactory global shape (a).(d) is an intermediate case.

Table 1 .
Accuracy assessment of gap detection (summarized from the confusion matrix) for the different methods in leaf-off and leaf-on conditions.GA is global accuracy, PA is producer's accuracy and CA is consumer's accuracy.

Table 2 .
The out-of-bag accuracy is an evaluation of the global accuracy of the final classification.The results are given for the per-object and the per-pixel supervised classifications for both leaf-off and leaf-on conditions.

Table 3 .
Amount of improvement over Canopy Height Model (CHM)-5m thresholding for the three best methods selected based on gap detection accuracy.The leaf-off results are compared with leaf-off CHM thresholding and leaf-on with leaf-on CHM thresholding.