Object-based Canopy Gap Segmentation and Classification: Quantifying the Pros and Cons of Integrating Optical and Lidar Data

Delineating canopy gaps and quantifying gap characteristics (e.g., size, shape, and dynamics) are essential for understanding regeneration dynamics and understory species diversity in structurally complex forests. Both high spatial resolution optical and light detection and ranging (LiDAR) remote sensing data have been used to identify canopy gaps through object-based image analysis, but few studies have quantified the pros and cons of integrating optical and LiDAR data for image segmentation and classification. In this study, we investigate whether the synergistic use of optical and LiDAR data improves segmentation quality and classification accuracy. The segmentation results indicate that LiDAR-based segmentation best delineates canopy gaps, compared to segmentation with optical data alone and even the integration of optical and LiDAR data. In contrast, the synergistic use of the two datasets provides higher classification accuracy than the independent use of optical or LiDAR data (overall accuracy of 80.28% ± 6.16% vs. 68.54% ± 9.03% and 64.51% ± 11.32%, respectively). High correlations between segmentation quality and object-based classification accuracy indicate that classification accuracy largely depends on segmentation quality in the selected experimental area. The outcome of this study provides valuable insights into the usefulness of data integration for segmentation and classification, not only for canopy gap identification but also for many other object-based applications.


Introduction
A canopy gap is defined as a small opening within a continuous and relatively mature canopy, where trees are absent (i.e., non-forest gaps) or much smaller than their immediate neighbors (i.e., forest gaps) [1]. Canopy gaps usually form through natural disturbances, such as individual tree mortality caused by insects or disease [2], or through silvicultural thinning or harvesting activities [3]. Canopy gaps play an important role in forest regeneration, turnover, and the overall dynamics of forest ecosystems [1]. In northern hardwood forests, for example, gap size plays a critical role in controlling the regeneration of tree species that are not tolerant of deep shade [4]. Canopy gaps can also transform understory microenvironments (e.g., solar energy, water, and nutrients), which shape understory biodiversity and habitats. In boreal mixedwood forests, Vepakomma et al. [5] found that canopy gaps increased the availability of abiotic resources within the gaps as well as up to 30 m into the surrounding forest.
Compared to manual interpretation, which requires extensive field validation and training, remote sensing offers an efficient and accurate alternative for automated canopy gap identification. Medium spatial resolution Landsat images were able to detect relatively large forest canopy gaps [6,7] but failed to map fine-scale canopy gaps (i.e., under 30 m in size) [8]. The emergence of high spatial resolution images, accompanied by the prevalence of Object-Based Image Analysis (OBIA), has overcome this shortcoming. OBIA views a group of similar image pixels as a segment and is particularly useful for high spatial resolution image processing because geo-objects tend to occupy many pixels at a fine scale. Compared to traditional pixel-based methods, OBIA reduces spectral variability within geo-objects and suppresses the "salt and pepper" noise in the classification map [9]. For example, Jackson et al. [10] evaluated the potential of a high spatial resolution IKONOS image (4 m) for identifying windthrown gaps, and found it could characterize more gaps than manual interpretation of temporally coincident aerial photographs. In addition, He et al. [11] successfully separated non-vegetated trails, roads, and cut blocks from vegetated areas using a high spatial resolution SPOT image. Malahlela, Cho and Mutanga [12] tested the utility of a WorldView-2 image with eight spectral bands to delineate forest canopy gaps, and concluded that it yielded higher accuracy than conventional images with four spectral bands. These studies demonstrate that broad-band multispectral images produce promising results for canopy gap identification. However, the saturation of visible-near-infrared signals [1] makes it a challenge to discriminate between tree canopies and forest gaps [12]. Narrow-band hyperspectral images could solve this problem, but their prohibitive acquisition cost limits the exploitation of their full potential [13,14]. Light detection and ranging (LiDAR) data have recently become one of the most important data sources for analyzing canopy gap dynamics. Vepakomma et al. [15] used a LiDAR-derived Canopy Height Model (CHM) to identify canopy gaps larger than 5 m² in a conifer-dominated forest. Gaulton and Malthus [16] compared CHM- and point-cloud-based techniques, and observed that the latter resulted in higher overall accuracy than CHM-based methods. Nevertheless, very few studies involving canopy gap delineation and classification have integrated passive multispectral images and active LiDAR data to take advantage of both spectral and vertical information.
In our study, a synergistic use of optical and LiDAR data is adopted for canopy gap delineation and classification in a structurally complex forest located in Haliburton Forest and Wildlife Reserve, Ontario, Canada. In the OBIA workflow, canopy gap delineation refers to sketching the boundaries of canopy gaps, whereas object-based classification categorizes the delineated geo-objects into different types of canopy gaps, including non-forest gaps (2-6 m in height) and forest gaps (6-10 m in height). Canopy gap delineation is done through image segmentation, which is the prerequisite step for object-based classification. This study focuses on answering the following three research questions: (1) how does the synergistic use of optical and LiDAR data influence the quality of canopy gap segmentation; (2) what are the advantages of the synergistic use of optical and LiDAR data in the process of object-based canopy gap classification; and (3) to what extent can the quality of canopy gap segmentation affect the accuracy of object-based gap classification?

Study Area and Experimental Site Selection
Our study was carried out in Haliburton Forest and Wildlife Reserve, located in the Great Lakes-St. Lawrence region of central Ontario, Canada (Figure 1). The forest is approximately 30,000 ha in area and is primarily composed of uneven-aged, mixed-deciduous forest dominated by shade-tolerant hardwood species. At present, sugar maple (Acer saccharum) represents approximately 60% of basal area, with American beech (Fagus grandifolia), yellow birch (Betula alleghaniensis), black cherry (Prunus serotina), balsam fir (Abies balsamea), and eastern hemlock (Tsuga canadensis) also present and relatively abundant [17]. The experimental site (approximately 1000 hectares) was chosen for the study (Figure 1) because canopy gaps in this site vary in type, size, and structure (water bodies included).

Multi-Source Remote Sensing Data
The optical multispectral image was acquired with the ADS40 airborne sensor by the Ontario Ministry of Natural Resources and Forestry (OMNRF) in the summer of 2007. The image contains four spectral bands (i.e., blue: 420-492 nm, green: 533-587 nm, red: 604-664 nm, near infrared (NIR): 833-920 nm) with a spatial resolution of 0.4 m. LiDAR data were collected by an Optech Airborne Laser Terrain Mapper (ALTM) 3100 LiDAR system in the summer of 2009. The flight was conducted at a height of 1500 m with a 16-degree field of view, a scan rate of 36 Hz, and a maximum pulse repetition frequency of 70 kHz. The average sampling point density is 1.7 points per square meter. In the process of creating the CHM, the LiDAR points within the study area were normalized to the terrain and outliers were filtered. The height of each pixel (cell size: 2 m) was derived from the maximum normalized LiDAR point that intersected the 2 m pixel (Haliburton & Nipissing LiDAR Survey). The multispectral image and CHM were clipped to cover the whole experimental site (Figure 2). As the CHM derived from the LiDAR data had a spatial resolution of 2.0 m, the multispectral image was resampled to 2.0 m by the nearest neighbor method to keep the data compatible.
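The CHM gridding step described above (keeping the maximum normalized return per 2 m cell) can be sketched as follows. The function name and the toy coordinates are our own; a real pipeline would also handle empty cells (e.g., by interpolation):

```python
import numpy as np

def rasterize_chm(x, y, z, cell=2.0):
    """Grid normalized LiDAR returns into a CHM: each cell keeps the
    maximum normalized height of the points falling inside it."""
    cols = ((x - x.min()) // cell).astype(int)
    rows = ((y.max() - y) // cell).astype(int)   # row 0 = northernmost cells
    chm = np.zeros((rows.max() + 1, cols.max() + 1))
    # np.maximum.at applies an unbuffered per-index maximum, so repeated
    # hits on the same cell keep the tallest return
    np.maximum.at(chm, (rows, cols), z)
    return chm

# Toy example: four points on a 2 m grid
x = np.array([0.5, 1.5, 2.5, 0.5])
y = np.array([0.5, 0.5, 0.5, 2.5])
z = np.array([3.0, 7.0, 4.0, 1.0])
chm = rasterize_chm(x, y, z)
```

The first two points share a cell, so that cell keeps the taller 7.0 m return.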

Methods
The processing steps for the automated canopy gap delineation and classification are shown in Figure 3. Image segmentation was implemented to delineate canopy gaps at a range of scale parameters over three data sources: (1) the multispectral image; (2) the CHM; and (3) the combination of the multispectral image and CHM. The suitable scale parameter that produced the best segmentation was identified, and the best segmentation map among the three data sources was then adopted for the subsequent object-based classifications. Next, the geo-objects in the best segmentation were assigned spectral (multispectral image), height (CHM), or both spectral and height information for classification, and were automatically categorized into three classes: non-forest gaps, forest gaps, and tree canopies. The classification accuracies were quantified by parameters derived from the confusion matrices. Finally, the relation between segmentation quality and classification accuracy was investigated.
Remote Sens. 2015, 7
Figure 3. The flowchart of the automated gap delineation and classification using optical and LiDAR data.

Canopy Gap Delineation
To segment canopy gaps, we employed a prevailing segmentation algorithm, multiresolution segmentation (MRS), implemented in the Trimble eCognition Developer software package. MRS uses a locally oriented region merging technique that executes pairwise merging within a local vicinity [18]. The segmentation process is controlled by three parameters (scale, shape, and compactness [19]), and the size of segments is primarily determined by the scale parameter. In general, a high scale parameter produces larger segments whereas a low scale parameter produces smaller segments. In our experiment, the scale parameter was adjusted to produce different segmentation results while the other two parameters (i.e., shape and compactness) were fixed at the default values (i.e., 0.1 and 0.5) because canopy gaps varied in shape and compactness. When more than one data source was used in the MRS process, each layer was weighted equally.
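MRS itself is proprietary to eCognition, but the scale-controlled pairwise merging it relies on can be illustrated with a deliberately simplified one-dimensional sketch (the function name, cost function, and values below are our own, not the MRS implementation): neighboring segments merge while the increase in within-segment variance stays below the squared scale threshold, so a larger scale yields larger segments.

```python
import numpy as np

def merge_1d(values, scale):
    """Greedy pairwise merging along one dimension. Adjacent segments
    merge while the increase in within-segment heterogeneity (size-
    weighted variance) stays below scale**2."""
    segments = [[v] for v in values]
    merged = True
    while merged:
        merged = False
        for i in range(len(segments) - 1):
            a, b = segments[i], segments[i + 1]
            # heterogeneity increase caused by merging a and b
            cost = (len(a) + len(b)) * np.var(a + b) \
                   - len(a) * np.var(a) - len(b) * np.var(b)
            if cost < scale ** 2:
                segments[i:i + 2] = [a + b]
                merged = True
                break
    return segments

# Two flat regions with a sharp break: scale=1 merges within, not across
segs = merge_1d([1.0, 1.1, 0.9, 5.0, 5.2], 1.0)
```

With a larger scale (e.g., 10), all five values collapse into a single segment, mirroring the scale-to-segment-size behavior described above.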
Canopy gap delineation was achieved through image segmentation, so the accuracy of canopy gap delineation can be assessed by segmentation evaluation algorithms. Of the many existing measures of segmentation evaluation (e.g., analytical and empirical goodness), indicators of empirical discrepancy have been demonstrated to be the most effective [20,21] because they capture the dissimilarity between a reference polygon and a corresponding segment. In recent years, many discrepancy measures have been proposed [22-25]. Yang et al. [26] proposed the Modified Euclidean Distance 3 (ED3Modified) indicator to measure local metrics of geometric and arithmetic discrepancy, globalizing ED3Modified with equal weight given to each reference polygon over the whole image. The ED3Modified value ranges from 0 to 0.71, with a lower value indicating a better segmentation. Yang et al. [27] later developed a new discrepancy measure, the Segmentation Evaluation Index (SEI), to quantify segmentation accuracy from the perspective of geo-object recognition. SEI is a stricter discrepancy measure because it requires a one-to-one correspondence between reference polygons and candidate segments. Similar to ED3Modified, a lower value of SEI indicates a higher quality of segmentation, although SEI ranges from 0 to 1.
Since SEI is a stricter indicator of over-segmentation than ED3Modified [26,27], and since over-segmented geo-objects do not impact object-based classification as negatively as under-segmented geo-objects, we decided to use ED3Modified to quantify the accuracy of canopy gap delineation at scale parameters between 10 and 100. We manually digitized 29 reference polygons for non-forest gaps and 53 for forest gaps (Figure 2). A total of 82 reference polygons were used to calculate the ED3Modified values to determine the quality of canopy gap segmentation. For the segmentation results at those scale parameters, one-way analysis of variance (ANOVA) was implemented to identify whether the synergistic use of optical and LiDAR data could significantly improve the accuracy of canopy gap segmentation. The best segmentation result was determined by the lowest value of ED3Modified and was further used for the object-based canopy gap classification. We also calculated the SEI values for the above segmentation results in order to determine which index (i.e., ED3Modified or SEI) was more related to object-based classification accuracy.
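The published formulas for ED3Modified and SEI are not reproduced here, but the reference-versus-segment logic such discrepancy measures build on can be illustrated with a simplified score of our own (not the published index): compare one reference mask with its best-matching segment and combine the over- and under-segmentation rates.

```python
import numpy as np

def discrepancy(ref, seg):
    """Simplified geometric discrepancy between one reference mask and a
    labeled segmentation (our own illustrative variant, not ED3Modified).
    0 means a perfect match; higher values mean worse agreement."""
    sids, counts = np.unique(seg[ref], return_counts=True)
    s = seg == sids[np.argmax(counts)]   # best-matching segment (largest overlap)
    inter = (s & ref).sum()
    over = 1 - inter / ref.sum()         # reference area the segment misses
    under = 1 - inter / s.sum()          # segment area spilling past the reference
    return float(np.sqrt((over**2 + under**2) / 2))

ref = np.zeros((4, 4), dtype=bool)
ref[:2, :2] = True                       # reference gap: top-left 2x2 block
exact = np.where(ref, 1, 2)              # segmentation matching the reference
coarse = np.ones((4, 4), dtype=int)      # one big under-segmented region
```

The exact segmentation scores 0, while the under-segmented one is penalized, mirroring how lower ED3Modified values indicate better delineation.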

Object-Based Canopy Gap Classification
The segmented geo-objects were assigned spectral, height, or both spectral and height information (i.e., the mean pixel values within the segments) and then classified into three classes (non-forest gaps, forest gaps, and tree canopies) using the support vector machine (SVM) available in the R package kernlab [28]. The SVM classifier implicitly maps the original feature space into a space of higher dimensionality, where classes can be modeled as linearly separable [29]. This transformation is performed by applying kernel functions (e.g., linear, polynomial, Radial Basis Function (RBF), and sigmoid) to the original data. The learning of the classifier is associated with a constrained optimization process over a complex cost function [30]. All the original data layers (i.e., multispectral, CHM, multispectral + CHM) were imported into the SVM classifier. A set of samples containing 29 polygons (17,315 pixels) for non-forest gaps, 53 polygons (16,341 pixels) for forest gaps, and 17 polygons (16,973 pixels) for tree canopies was randomly selected as the training and test samples for object-based canopy gap classification (Figure 2). The SVM classifier with an RBF kernel was employed for classification. A 10-fold cross-validation was implemented for accuracy assessment, and one-way ANOVA was used to determine whether the integrated optical and LiDAR data could lead to a significant improvement in canopy gap classification in comparison with the independent data sources.
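The study ran the SVM in R via kernlab; a comparable sketch in Python with scikit-learn is shown below. The per-segment features, class means, and sample sizes are illustrative assumptions of ours, not the study's data:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 60
# Synthetic per-segment features: [mean NIR reflectance, mean CHM height]
X = np.vstack([
    rng.normal([0.2, 1.0], 0.1, (n, 2)),    # non-forest gaps: low height
    rng.normal([0.5, 8.0], 0.1, (n, 2)),    # forest gaps: intermediate height
    rng.normal([0.6, 20.0], 0.1, (n, 2)),   # tree canopies: tall
])
y = np.repeat([0, 1, 2], n)

clf = SVC(kernel="rbf", gamma="scale")       # RBF-kernel SVM, as in the paper
scores = cross_val_score(clf, X, y, cv=10)   # 10-fold cross-validation
```

Each fold's accuracy is returned in `scores`; the paper's reported overall accuracies are the analogous cross-validated means over its real per-segment features.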
A confusion matrix was used to quantitatively evaluate classification accuracy. The accuracy parameters derived from a confusion matrix consist of Producer's Accuracy (PA), User's Accuracy (UA), and Overall Accuracy (OA). As recommended by Olofsson et al. [31], post-stratified estimators of the accuracy parameters have better precision than the estimators commonly used when test samples are selected randomly or systematically. We used the post-stratified Producer's Accuracy, User's Accuracy, and Overall Accuracy to estimate classification accuracy, and further investigated how the quality of canopy gap delineation affected the accuracy of object-based classification.
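The unweighted versions of these parameters follow directly from the confusion matrix; a minimal sketch is given below (the post-stratified estimators of Olofsson et al. would additionally weight each stratum by its mapped area, which is omitted here). The example matrix is hypothetical:

```python
import numpy as np

def accuracy_metrics(cm):
    """PA, UA, and OA from a confusion matrix whose rows are reference
    classes and whose columns are predicted classes (unweighted counts)."""
    cm = np.asarray(cm, dtype=float)
    pa = np.diag(cm) / cm.sum(axis=1)   # producer's accuracy: 1 - omission error
    ua = np.diag(cm) / cm.sum(axis=0)   # user's accuracy: 1 - commission error
    oa = np.trace(cm) / cm.sum()
    return pa, ua, oa

# Hypothetical 3-class matrix: non-forest gaps, forest gaps, tree canopies
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 1, 9]]
pa, ua, oa = accuracy_metrics(cm)
```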

Results
For the segmentation of canopy gaps, the one-way ANOVA indicated that there were significant differences among the segmentations of the multispectral image, CHM, and combined data. Dunnett's T3 test indicated that ED3Modified was significantly lower when using the CHM to segment canopy gaps (0.56 ± 0.09) than when using the other two data sources over the set of scale parameters (p ≤ 0.05), indicating that the CHM produced the best segmentation results (Figure 4). There was no significant difference between the delineation results using the multispectral image (0.66 ± 0.03) and the integration of the multispectral image and CHM (0.66 ± 0.03). The lowest ED3Modified occurred at a scale parameter of 20 when segmenting the CHM, whereas the best segmentations of the other two sources were produced at a scale parameter of 10. For non-forest gaps, such as the waterbody shown in Figure 5a,b, segmentation results from the three datasets were generally acceptable. However, the waterbody segmented from the CHM (Figure 5d) extended beyond its boundary to the lakeshore and thus did not match the reference geo-object as well as those from the multispectral image (Figure 5c,e,f). This is to be expected because the spectral contrast between the waterbody and its neighboring lakeshore was much stronger than the height difference. Most of the forest gaps, as shown in Figure 6a,b, were well segmented by the CHM (Figure 6d) despite slight over-segmentation. The inclusion of the multispectral image did not improve segmentation but resulted in over-segmentation (Figure 6c,e,f), likely due to the spectral confusion between forest gaps and the neighboring tree canopies.

Figure 5. Tiles (a) and (b) show the reference geo-objects (blue polygons) of non-forest gaps imposed on the multispectral image and CHM, respectively. Tile (c) depicts the result of segmentation using the multispectral image at the optimal scale parameter of 10, while Tile (d) indicates the corresponding segments from segmenting the CHM at the optimal scale parameter of 20. Tiles (e) and (f) show the optimal segmentation results through the integration of the multispectral image and CHM at the scale parameter of 10.

The best canopy gap delineation was yielded by the segmentation of the LiDAR-derived CHM at the scale parameter of 20, with a low ED3Modified of 0.39. The averaged pixel values (i.e., spectral and height) within each geo-object segmented at this scale parameter were used for the subsequent object-based classifications.

With respect to object-based canopy gap classification, the one-way ANOVA revealed significant differences among the three classifications (Figure 7). For post-hoc multiple comparisons, the Tukey-Kramer test indicated that the overall accuracy of canopy gap classification using both spectral and height information (80.28% ± 6.16%) was significantly higher than those using sole sources of information (spectral: 68.54% ± 9.03%; height: 64.51% ± 11.32%) (p ≤ 0.05). To further interpret the classification accuracies of non-forest and forest gaps, we used the confusion matrices (Table 1) to derive the producer's and user's accuracies (Table 2). The producer's accuracy of forest gaps using both spectral and height information was higher than those using either spectral or height information alone, indicating that a synergistic use of spectral and height information could reduce the omission of forest gaps. A subset of the individual canopy gap classification maps, produced by the three different data sources, is illustrated in Figure 7, and the final classification map of canopy gaps (i.e., non-forest gaps and forest gaps) is depicted in Figure 8. The differences in non-forest gap classification among the three datasets were not obvious, but the combination of spectral and height information led to a better forest gap classification (Figure 7e). The use of a single data source (i.e., spectral or height information) misclassified forest gaps as non-forest gaps (Figure 7c,d). The independent use of height information suffered higher omission than the use of spectral information due to serious confusion in height between non-forest and forest gaps. This result is consistent with the PA values of forest gaps (Table 2; spectral: 74.10% vs. height: 68.28%).

The performance of ED3Modified and SEI for evaluating the CHM segmentation results was compared at a series of scale parameters (i.e., 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100) by relating them to the corresponding overall accuracies when using both spectral and height information for object-based classification (Figure 9). Since a higher overall accuracy indicates better classification while a lower ED3Modified or SEI indicates better segmentation, the absolute value of the correlation coefficient (|R|) was used to gauge the strength of the relationships. ED3Modified had a stronger correlation with overall accuracy (0.83) than SEI (0.79), suggesting that ED3Modified was more closely related to classification accuracy.
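The |R| comparison is mechanical once the per-scale values are in hand; the sketch below uses hypothetical values standing in for the data points of Figure 9 (the numbers are illustrative, not the study's):

```python
import numpy as np

# Hypothetical per-scale values: segmentation discrepancy (lower is
# better) paired with overall classification accuracy at the same scale.
ed3 = np.array([0.39, 0.45, 0.52, 0.60, 0.68])
oa = np.array([0.80, 0.76, 0.71, 0.66, 0.62])

r = np.corrcoef(ed3, oa)[0, 1]   # Pearson correlation coefficient
strength = abs(r)                # |R| gauges strength regardless of sign
```

Because better segmentation means a lower discrepancy but a higher accuracy, the raw correlation is negative, which is why the absolute value is the meaningful quantity.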

Remote Sensing Data Processing
Owing to the increasing availability of various remote sensors, the synergistic use of remote sensing data has attracted growing attention in many applications. Data integration is now the preferred approach in OBIA studies, such as individual-tree-based species classification [32-34], urban land cover extraction [35,36], and small-scale wetland mapping [37-39].
Data integration in object-based applications involves two steps: image segmentation and image classification. In this study, both the multispectral image and the LiDAR-derived CHM were utilized in the processes of canopy gap delineation and object-based classification. We found that canopy gaps can be better delineated (i.e., segmented) by the independent use of the LiDAR-derived CHM than by the combined use of the two datasets. This finding is reasonable because height information is expected to be more informative than spectral information, especially for forest gap delineation, where gap boundaries are sharper in height than in spectral features. However, the synergistic use of the optical and LiDAR datasets resulted in a better classification map than the independent use of any sole data source. This result suggests that both vertical and spectral information contributed to the separation of forest gaps from non-forest gaps and tree canopies.
These results are consistent with existing OBIA studies, in which multi-source remote sensing data have been widely applied for object-based classification but rarely for segmentation. The success of data integration is heavily dependent on the compatibility of multi-source remote sensing data, specifically the consistency in spatial, spectral, temporal, and radiometric resolutions. This study resampled the multispectral image from 0.4 m to 2 m to match the spatial resolution of the CHM. Otherwise, the MRS and SVM algorithms would segment and classify all the layers at the finest spatial resolution (i.e., 0.4 m), reducing the computing efficiency of canopy gap identification.
Our multispectral image contains four spectral bands while the CHM has only one layer of height information. We treated these five layers (i.e., four spectral bands + one height layer) equally: the CHM was assigned a weight of 20% for both segmentation and classification, while each layer of the multispectral image was also weighted 20%, resulting in a total of 80% for the multispectral image. However, most segmentation algorithms, for instance region merging [19,40,41] and watershed transformation [42-44], use a single layer or weighted averages of multiple layers. If spectral layers introduce noise, their use could reduce segmentation quality, particularly where they are highly weighted. This may explain why adding the multispectral image negatively affected canopy gap delineation. The majority of popular classifiers, such as decision trees [45], SVM [29], and random forests [46], view multiple layers independently as a variety of features. Such classifiers can favor useful features and discount features that introduce noise and/or have marginal impact on classification accuracy.
Simultaneous collection of multi-source data is critical for the success of data integration. Although the LiDAR data used in this study were collected two years later than the multispectral image, we still treat them as if they were acquired simultaneously, because the two datasets were both acquired during the summer and because there were no extreme weather events (e.g., microbursts, ice storms) during the intervening two years, so it is fair to assume that changes in the forest were not substantial.
Another potential source of error could be the radiometric resolution of the two datasets, which is usually ignored in the synergistic use of data. In this study, the multispectral image was 8-bit, so the data ranged from 0 to 255, which is quite different from the range of the CHM (i.e., true height). Future work should investigate whether inconsistent data ranges reduce the effectiveness of data integration for segmentation and classification.
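One common way to address such range mismatches is to rescale each layer before integration; a minimal sketch is given below (simple min-max scaling, one of several plausible normalizations, and not a step the study itself applied):

```python
import numpy as np

def minmax(layer):
    """Rescale a layer to [0, 1] so that an 8-bit band (0-255) and a CHM
    in metres contribute comparable numeric ranges when integrated."""
    layer = layer.astype(float)
    return (layer - layer.min()) / (layer.max() - layer.min())

band = np.array([0, 64, 128, 255])      # 8-bit digital numbers
chm = np.array([0.0, 5.0, 10.0, 20.0])  # canopy heights in metres
```

After rescaling, a distance- or variance-based algorithm no longer lets the layer with the numerically larger range dominate by default.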
We also found that the discrepancy measures of segmentation quality (i.e., ED3 Modified and SEI) were highly related to the indicator of classification accuracy (OA), suggesting that the quality of canopy gap delineation strongly affects the accuracy of the subsequent object-based classification. Compared with SEI, ED3 Modified is more closely related to classification accuracy. This is understandable because ED3 Modified focuses geometric discrepancy on under-segmentation and is more tolerant of over-segmentation [26,27]. Since over-segmentation does not affect object-based classification as strongly as under-segmentation, ED3 Modified is more suitable for evaluating segmentation quality from the perspective of object-based classification. In other words, ED3 Modified has an advantage over SEI for assessing segmentation quality when the goal of segmentation is classification rather than geo-object recognition.
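The relationship between a discrepancy measure and classification accuracy can be quantified with a Pearson correlation, as sketched below on purely illustrative numbers (not the values measured in this study): each pair is one segmentation run's ED3 Modified score and the overall accuracy of the classification built on it.

```python
import numpy as np

# Illustrative values only: discrepancy scores for several segmentation
# runs and the overall accuracy of the resulting classifications.
ed3_modified = np.array([0.56, 0.68, 0.75, 0.90, 1.10])
overall_acc  = np.array([0.80, 0.74, 0.70, 0.66, 0.58])

# A strongly negative r means higher geometric discrepancy
# (worse segmentation) goes with lower classification accuracy.
r = np.corrcoef(ed3_modified, overall_acc)[0, 1]
print(f"r = {r:.2f}")
```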

Forest Ecosystem Management
Accurate segmentation and classification of canopy gaps through the synergistic use of optical images and LiDAR data play a critical role in understanding forest regeneration dynamics and may help predict future forest conditions [16,[47][48][49]. For example, larger canopy gaps may result in the establishment of early- and mid-successional species, while smaller canopy gaps may promote the establishment of late-successional species [1].
The success of canopy gap identification using remote sensing data demonstrates that it is possible to investigate canopy gap dynamics if multitemporal data are available. Monitoring gap opening, closure, expansion, and displacement could help clarify the role of canopy gaps in forest succession [1]. For example, Yu et al. [50] used bi-temporal LiDAR-derived CHMs to detect harvested and fallen trees over time. St-Onge and Vepakomma [51] highlighted the potential of multi-temporal medium-density LiDAR data for understanding gap dynamics in a spatially explicit manner, particularly for identifying new canopy gaps and assessing height growth.
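The core of such bi-temporal analyses can be sketched as a simple CHM difference: subtract two co-registered CHMs and flag cells where the canopy dropped below a gap-height threshold. The thresholds and data here are hypothetical, chosen for illustration rather than taken from [50] or [51].

```python
import numpy as np

def new_gaps(chm_t1, chm_t2, gap_height=2.0, min_drop=5.0):
    """Boolean mask of cells that became gaps between t1 and t2.

    A cell is flagged when it was canopy at t1 (height >= gap_height),
    is open at t2 (height < gap_height), and the height drop is large
    enough to rule out interpolation noise (drop >= min_drop).
    """
    drop = chm_t1 - chm_t2
    return (chm_t1 >= gap_height) & (chm_t2 < gap_height) & (drop >= min_drop)

# Tiny toy CHMs (meters); one canopy cell is removed between dates.
chm_t1 = np.array([[20.0, 18.0, 1.0],
                   [22.0, 19.0, 1.5]])
chm_t2 = np.array([[20.5,  1.0, 1.0],
                   [21.0, 18.5, 1.2]])

mask = new_gaps(chm_t1, chm_t2)
print(mask)  # True only where the tree was removed
```

In practice the resulting mask would be cleaned (e.g., minimum mapping unit, morphological filtering) before interpreting patches as gap openings, and gap closure can be detected symmetrically by swapping the two dates.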
Canopy gap delineation also assists in determining the size and distribution of gaps within a forested area (e.g., Haliburton Forest), which in turn relates to many important attributes commonly found in forest resource inventories, notably crown closure, stocking, and forest structure [49]. Populating these attributes in a semi-automated way, through efficient remote sensing image processing algorithms rather than subjective photo interpretation or intensive ground surveys, could increase the efficiency and accuracy of forest resource inventory. Accurate delineation of canopy gaps should also benefit subsequent efforts to delineate tree crowns, and thus contribute to species identification at the individual tree level, which has proven challenging in species-diverse mixed deciduous forests [52].

Conclusions
Canopy gap fraction is a critically important parameter for structurally complex forests. In this study, we incorporated both an optical multispectral image and a LiDAR-derived CHM for canopy gap identification over the selected experimental site in Haliburton Forest and Wildlife Reserve. Data integration was implemented in both canopy gap segmentation and object-based classification. The experimental results demonstrated that the independent use of the CHM yielded the best canopy gap segmentation, as indicated by the lowest value of the discrepancy index (i.e., ED3 Modified: 0.56 ± 0.09). Moreover, the synergistic use of the multispectral image and CHM produced more accurate gap classification (i.e., OA: 80.28% ± 6.16%) than the independent use of either data source (i.e., multispectral image: 68.54% ± 9.03%; CHM: 64.51% ± 11.32%). Further, the correlation between canopy gap segmentation quality and classification accuracy was 0.83, indicating that segmentation quality strongly influenced the subsequent object-based classification. The significance of this study is not limited to the improvement of canopy gap identification by data integration; it also extends to forest ecosystem management in terms of canopy gap dynamics.
Data integration has recently become a promising alternative in a variety of remote sensing applications. Higher accuracy can be achieved through the synergistic use of multi-source data; however, special attention should be paid to keeping the data compatible in terms of spatial, spectral, temporal, and radiometric resolution. In particular, efforts should be made to avoid negative impacts of data incompatibility on the image segmentation step of OBIA. It should be noted that this study employed the MRS algorithm for segmentation and the SVM classifier with the RBF kernel for classification. Further work should investigate whether similar conclusions would be drawn using other segmentation and classification approaches.

Figure 1. Location of the experimental site in Haliburton Forest and Wildlife Reserve, Ontario, Canada.

Figure 2. Multispectral image (a) and LiDAR-derived CHM (b) for the experimental site (under the NAD83 UTM coordinate system). Blue, red, and green polygons represent the reference non-forest gaps (2-6 m in height), forest gaps (6-10 m in height), and tree canopies, respectively.

Figure 3. Flowchart of the automated gap delineation and classification using optical and LiDAR data.

Figure 4. ED3 Modified values for the canopy gap delineation results using the multispectral image (IMGSEG), the LiDAR-derived CHM (CHMSEG), and both image and CHM (BOTHSEG) as a function of the scale parameter, ranging from 10 to 100 at an interval of 10.

Figure 5. Examples of non-forest gap segmentation results using different data sources. Tiles (a) and (b) show the reference geo-objects (blue polygons) of non-forest gaps superimposed on the multispectral image and CHM, respectively. Tile (c) depicts the segmentation result using the multispectral image at the optimal scale parameter of 10, while tile (d) shows the corresponding segments from segmenting the CHM at the optimal scale parameter of 20. Tiles (e) and (f) show the optimal segmentation result from the integration of the multispectral image and CHM at the scale parameter of 10.

Figure 6. Examples of forest gap segmentation results using different data sources. Tiles (a) and (b) show the reference geo-objects (red polygons) of forest gaps superimposed on the multispectral image and CHM, respectively. Tile (c) depicts the segmentation result using the multispectral image at the optimal scale parameter of 10, while tile (d) shows the corresponding segments from segmenting the CHM at the optimal scale parameter of 20. Tiles (e) and (f) show the optimal segmentation result from the integration of the multispectral image and CHM at the scale parameter of 10.

Figure 7. A subset of the multispectral image (a); CHM (b); and canopy gap classification maps by spectral (c); height (d); and spectral + height information (e). Blue and red polygons superimposed on the multispectral image (a) and CHM (b) represent the reference geo-objects of non-forest and forest gaps, respectively. Blue, red, and green pixels in the classification maps (c-e) represent non-forest gaps, forest gaps, and tree canopies, respectively.

Table 1. Confusion matrices of object-based canopy gap classifications by spectral, height, and both spectral and height information.

Table 2. Accuracy parameters of object-based canopy gap classifications by spectral, height, and both spectral and height information.
