Using Leaf-O ﬀ and Leaf-On Multispectral Airborne Laser Scanning Data to Characterize Seedling Stands

: Information from seedling stands in time and space is essential for sustainable forest management. To fulﬁl these informational needs with limited resources, remote sensing is seen as an intriguing alternative for forest inventorying. The structure and tree species composition in seedling stands have created challenges for capturing this information using sensors providing sparse point densities that do not have the ability to penetrate canopy gaps or provide spectral information. Therefore, multispectral airborne laser scanning (mALS) systems providing dense point clouds coupled with multispectral intensity data theoretically o ﬀ er advantages for the characterization of seedling stands. The aim of this study was to investigate the capability of Optech Titan mALS data to characterize seedling stands in leaf-o ﬀ and leaf-on conditions, as well as to retrieve the most important forest inventory attributes, such as distinguishing deciduous from coniferous trees, and estimating tree density and height. First, single-tree detection approaches were used to derive crown boundaries and tree heights from which forest structural attributes were aggregated for sample plots. To predict tree species, a random forests classiﬁer was trained using features from two single-channel intensities (SCIs) with wavelengths of 1550 (SCI-Ch1) and 1064 nm (SCI-Ch2), and multichannel intensity (MCI) data composed of three mALS channels. The most important and uncorrelated features were analyzed and selected from 208 features. The highest overall accuracies in classiﬁcation of Norway spruce, birch, and nontree class in leaf-o ﬀ and leaf-on conditions obtained using SCI-Ch1 and SCI-Ch2 were 87.36% and 69.47%, respectively. The use of MCI data improved classiﬁcation by up to 96.55% and 92.54% in leaf-o ﬀ and leaf-on conditions, respectively. Overall, leaf-o ﬀ data were favorable for distinguishing deciduous from coniferous trees and tree density estimation with a relative root mean square error (RMSE) of 37.9%, whereas leaf-on data provided more accurate height estimations, with a relative RMSE of 10.76%. Determining the canopy threshold for separating ground returns from vegetation returns was found to be critical, as mapped trees might have a height below one meter. The results showed that mALS data provided beneﬁts for characterizing seedling stands compared to single-channel ALS systems.


1.
MCI data, together with dense point clouds from mALS, will improve the classification of segments to spruce, birch, and nontrees, compared to single-channel ALS data from 1550 nm and 1064 nm in seedling stands, 2.
Leaf-off data will be more suitable for inventories of coniferous seedlings, and 3.
Lowering C th improves tree detection, classification, and height estimation.

Study Area and Establishment of Sample Plots
The study area is located in southern boreal forest zone in Evo, Finland (61.20 • N, 25.08 • E, 133-150 m above sea level; Figure 1). It mainly includes managed forests where Scots pine (Pinus sylvestris L.) and Norway spruce are the dominant tree species.
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 20 With single-channel ALS for forest inventories at a species level, aerial images are required. However, mALS providing multichannel intensity (MCI) data which are equivalent to aerial photos can eliminate this need. This was demonstrated in the study of Yu et al. [31], where it was applied as a single-senor solution for mature forest inventories. Moreover, treetops are more likely to be hit with the increase of point density, which is important for seedling stand inventories, because treetops are closely located to one another, and penetrating between canopies is important. Hence, the features of mALS make it potentially capable of characterizing seedling stands. To the best of authors knowledge, this has not been studied yet, and understanding is limited about how well it could be applied.
This study was mainly designed to investigate the ability of mALS data to provide dense and MCI point clouds to estimate tree density and height in young (mean tree height ≤ 1.3 m, YoS) and advanced seedling (mean tree height > 1.3 m, AdS) stands. In addition, we compared characterizations of the seedling stands in leaf-off and leaf-on conditions, and optimized Cth in tree detection, classification, and height estimation. We hypothesized that: 1. MCI data, together with dense point clouds from mALS, will improve the classification of segments to spruce, birch, and nontrees, compared to single-channel ALS data from 1550 nm and 1064 nm in seedling stands, Figure 1. Map of the study area and established sample plots in young seedling stand (upper image block) and in advanced seedling stand (lower image block).
One YoS and two AdS stands from the study area were selected based on existing forest resource information. The seedling stands were selected based on tree density, which had to be higher than 2400 trees per hectare (TPH), since this density was required to establish sample plots and thin them into different densities. Five sample plots were established with an 8 m radius for YoS, and 7 sample One YoS and two AdS stands from the study area were selected based on existing forest resource information. The seedling stands were selected based on tree density, which had to be higher than 2400 trees per hectare (TPH), since this density was required to establish sample plots and thin them Remote Sens. 2020, 12, 3328 4 of 20 into different densities. Five sample plots were established with an 8 m radius for YoS, and 7 sample plots with a 10-m radius were established for AdS. The YoS sample plots were thinned approximately into the following target tree densities: 1200, 1400, 1600, 1800, and 2000 TPH. Similarly, the sample plots in AdS were thinned to the following tree densities: 1000, 1200, 1400, 1600, 1800, 2200, and 2400 TPH.
The sample plots were established between April and May 2016. The locations of sample plots were recorded using the Trimble GeoXT Global Navigation Satellite System (GNSS) device (Trimble, Sunnyvale, CA, USA). The GNSS locations were differentially corrected by applying the data from the local reference station. The expected accuracy was below 1 m in an open area. Next, thinning treatments were applied followed by measuring the tree attributes in June 2016 (Table 1). Then, the plot-level forest inventory attributes of the sample plots were compiled. All the remaining trees in YoS sample plots were Norway spruce, unlike in AdS, where an admixture of birch from 0% to 34.3% existed. The site type of the sample plots was mesic heath forest. Among the plots, the proportion of Norway spruce was predominant over birch by 91.8%.
In AdS, the tree species was determined and the diameter at breast height (DBH) was measured with steel calipers from trees with a height ≥1.3 m. The height of each tree species was measured from every third tree using an electronic hypsometer (Vertex IV, Haglöf, Långsele, Sweden). The height of the tallest tree in each sample plot was also measured. YoS measurements were the same as those of AdSs, with the exception of diameter at ground, which was measured instead of DBH, since the mean height was less than 1.3 m in the YoS sample plots (Table 1). The height of the trees in each sample plot was predicted by fitting the Näslund's height curve [32], obtained from the sample tree height measurements. The relative root mean square error (rRMSE) of the tree height prediction was 12.8% (relative bias −0.13%) and 11.4% (relative bias 0.57%) for YoS and AdS, respectively. Plot-level tree density was calculated by dividing the number of trees observed in each plot by its area, and subsequently converting the area to hectares. Tree density and height statistics are shown in Table 1. Notably, many grasses and ferns were observed during field data collection in June.

Remote Sensing Data
Remote sensing data were captured using an Optech Titan mALS scanner (Teledyne Optech, Toronto, ON, Canada) from a 500 m flight height. The data were acquired on 1 May and 12 to 14 June 2016 under leaf-off and leaf-on conditions (as leaf-off and leaf-on data, respectively). The sensor sends and records laser pulses in three channels: green (Channel (Ch) 3, 532 nm), NIR (Channel 2, 1064 nm), and short-wave infrared (Channel 1, 1550 nm). Its beam divergence in Channels 1 and 2 is about 0.35 mrad (1/e), and in Channel 3 is about 0.7 mrad (1/e). The sensor operates with a pulse rate of 250 kHz per channel.
The data were preprocessed into ground and nonground points using a standard procedure in TerraScan (TerraSolid Oy, Helsinki, Finland). Possible noises, such as the hits below the ground or above the canopy, were then removed from the data. Next, the point clouds were height normalized by subtracting the ground elevation from the height measurement of the points using a digital terrain model (DTM) created from the classified ground points of three channels separately to avoid any possible discrepancy among the channels. The DTM was created in the format of a triangulated irregular network (TIN) and ground elevation for the point was interpolated based on the TIN model. The process was performed separately for leaf-on and leaf-off data. The accuracy of the DTM was expected to be 20 cm. The average point density of all laser returns per channel and for all channels within YoS, AdS, and all plots are presented in Table 2 with leaf-off and leaf-on data. The proportion of laser returns-i.e., first/single, second, third, and last returns-in leaf-off and leaf-on is listed in Table 3 within the plots of the study area.
More than 85% of laser returns in both leaf-off and leaf-on data were obtained from first/single returns, and less than 0.20% from last returns (Table 3). Figure 2 shows an outline and summary of each subsection in methodology part.     Table 4). The intensity of laser returns was not 208 calibrated. The features were then grouped into two categories: (1) single-channel intensity (SCI) and

Creating Canopy Height Model and Delineating Tree Crowns
The canopy height models (CHMs) with a resolution of 0.1 m were created by assigning the maximum height of the first and only/single laser returns of all three channels to each raster cell. Empty pixels (gaps) in the CHM were interpolated using the focal function in Raster R package [33] using a 7 × 7 kernel window size. The process of filling gaps only affected the null pixels. Next, the CHM was smoothed three times using a low-pass mean filter with a window size of 3 × 3; subsequently, the watershed segmentation method was applied in ArcMap (version 10.3.1) to segment tree crown boundaries without setting the threshold to flow in CHM, corresponding to a Cth of 0 m. In the watershed segmentation, the flow direction of each cell of the CHM was calculated after the CHM had been negated. Then, each segment represented a watershed where imagined water would flow from adjacent cells.

Feature Extraction from Multispectral ALS Data
The maximum height (H max ) of each segment was extracted from the highest point of all laser returns in each segment. Simultaneously, 208 spectral features were extracted from the intensity of the laser returns above 6 C th s (0, 0.2, 0.4, 0.6, 0.8, and 1 m; schematized in Figure 3) for each channel to separate Norway spruces from other vegetation ( Table 4). The intensity of laser returns was not calibrated. The features were then grouped into two categories: (1) single-channel intensity (SCI) and (2) multichannel intensity (MCI).
The SCI features include the minimum, maximum, mean, standard deviation, coefficient of variation, range, skewness, and kurtosis of the intensity, as well as the intensity percentiles from 5% to 95% in 5% increments for each channel (Table 4). MCI features were calculated using the ratios of the intensity features of each channel divided by the sum of all channels as ratio (R i F ), the subtraction of Channels 2 and 3, divided by the sum of Channels 2 and 3 (as green Normalized Differential Vegetation Index, gNDVI F ) and dividing Channel 2 by Channel 3 (as green simple ratio, gSR F ). In total, 78 SCI features, together with 130 MCI features, were extracted for each of the six C th per segment. The features were  The SCI features include the minimum, maximum, mean, standard deviation, coefficient of variation, range, skewness, and kurtosis of the intensity, as well as the intensity percentiles from 5% to 95% in 5% increments for each channel (Table 4). MCI features were calculated using the ratios of the intensity features of each channel divided by the sum of all channels as ratio (R i F), the subtraction of Channels 2 and 3, divided by the sum of Channels 2 and 3 (as green Normalized Differential Vegetation Index, gNDVIF) and dividing Channel 2 by Channel 3 (as green simple ratio, gSRF). In total, 78 SCI features, together with 130 MCI features, were extracted for each of the six Cth per segment. The features were statistical measurements applied to both SCI and MCI features for each segment. Table 4 provides definitions of the SCI and MCI features. Table 4. List of all features derived from spectral information of the laser returns Superscripts (i) 1, 2, and 3 represent channel numbers and subscript F refers to single-channel intensity feature used. Percentiles of intensity from 5% to 95% in 5% increments Multichannel intensity (MCI) features R i F = I i F/(I 1 F + I 2 F + I 3 F) Ratios of intensity features gNDVIF = (I 2 F -I 3 F)/(I 2 F + I 3 F) Green normalized differential vegetation index (gNDVI) gSRF = I 2 F/I 3 F Green simple ratio vegetation index (gSR)

Selection of Training and Validation Data
A total of 217 (67, 115 and 35 for birch, spruce and nontree, respectively) and 334 (113, 144 and 77 for birch, spruce and nontree, respectively) segments located outside the field sample plot boundaries were visually labeled in the leaf-off and leaf-on data, respectively. During visual labelling, UAV-RGB orthomosaics with a resolution of 2.5 cm captured on 9 and 11 May (leaf-off) and 29 June (leaf-on) in 2016 over the same study area [7] were used as a background image. Hmax was also used as an auxiliary attribute for differentiating birch from grasses and bushes to distinguish short leafless birches and tall bushes in leaf-off data. The segments were labeled as three classes: Norway spruce, birch, and nontree (stumps/deadwood, bush/grass, and rock). The labeled segments were then used as training data for feature importance analysis and classification.  Table 4. List of all features derived from spectral information of the laser returns Superscripts (i) 1, 2, and 3 represent channel numbers and subscript F refers to single-channel intensity feature used.

Feature Definition
Single-channel Intensity (SCI) features

Selection of Training and Validation Data
A total of 217 (67, 115 and 35 for birch, spruce and nontree, respectively) and 334 (113, 144 and 77 for birch, spruce and nontree, respectively) segments located outside the field sample plot boundaries were visually labeled in the leaf-off and leaf-on data, respectively. During visual labelling, UAV-RGB orthomosaics with a resolution of 2.5 cm captured on 9 and 11 May (leaf-off) and 29 June (leaf-on) in 2016 over the same study area [7] were used as a background image. H max was also used as an auxiliary attribute for differentiating birch from grasses and bushes to distinguish short leafless birches and tall bushes in leaf-off data. The segments were labeled as three classes: Norway spruce, birch, and nontree (stumps/deadwood, bush/grass, and rock). The labeled segments were then used as training data for feature importance analysis and classification.
Using a similar procedure as described above, a total of 92 (28, 58 and 6; for birch, spruce and nontree, respectively) and 96 (22, 48 and 26; for birch, spruce and nontree, respectively) segments were labeled as validation data located inside sample plot boundaries in leaf-off and leaf-on data, respectively. The aim was to create a number of validation data that was approximately one-third the number of training data for both epochs. In other words, they are two independent and separate datasets located inside (validation) and outside (training) of the plot boundaries.
In both the training and validation data, segments obtained from the watershed method using a C th of 0 m were used for labeling in leaf-off and leaf-on conditions. Finally, the labels were spatially joined to segments derived from features of other C th applied in leaf-off and leaf-on, respectively.

Feature Importance Analysis and Classification
Before the feature importance analyses, features with different C th s were subgrouped based on the intensity groups according to their SCI-Ch 1 , SCI-Ch 2 , SCI-Ch 3 , MCI-R i F , MCI-gNDVI F , and MCI-gSR F features. Next, the most important features from each subgroup for the separation of Norway spruces from birches, Norway spruces from nontrees, and birches from nontree classes were selected using a random forests (RF [34]) classifier with 1000 decision trees. The RF was implemented using version 0.20.1 of Scikit-learn [35]. The three most important features from each subgroup were preselected. The next step was to investigate the correlations between the selected three features with all other features in each subgroup. When detecting correlations above 0.8, less important features were removed from further analyses. Finally, the most important and uncorrelated features were used to train an RF model to classify the identified segments into Norway spruce, birch, and nontree classes.
In addition to the above-mentioned classification used for the selected features (combining all three channels; MCI), two other classifications were also run using SCI-Ch 1 and SCI-Ch 2 data, because they are the most commonly used wavelengths in single-channel ALS systems [36]. The feature importance analysis and classifications were carried out identically for each of the 36 datasets (6 C th s with three intensities in leaf-off and leaf-on data).

Tree Density and Height Retrieval
After predicting classes, the plot-level tree density and height of the Norway spruces were calculated. The tree density was calculated by dividing the predicted number of Norway spruce in each plot by its area, and subsequently converting the area to hectare. The tree height was derived by calculating arithmetic mean height of H max s of the segments classified as Norway spruce within each plot.

Accuracy Evaluations
The classification accuracy was evaluated by calculating the overall accuracy (OA), recall (i.e., omission error), and precision (i.e., commission error) using the validation data. For simplicity, only the mean of the recall and precision per class are reported.
The plot-level tree density of Norway spruce trees was compared to field-measured tree density, and the bias and RMSE were calculated using Equations (1)-(4). To evaluate the accuracy of remotely sensed tree height, the estimation of the plot-level mean tree height (derived from H max ) was compared with its corresponding height from field data using: where n is the number of plots, y i is the value estimated from the field data for plot i,ŷ i is the aggregated value from the detected trees for plot i, and y is the mean of the variable in the field data. Additionally, Pearson correlation coefficient (r) was calculated, which measures the linear relationship between the remotely sensed and field data as x and y, respectively (Equation (5)).
The tree density and height were analyzed and are reported for all sample plots (n = 12), YoS (n = 5), and AdS (n = 7) for both leaf-off and leaf-on conditions using mALS data and 6 C th s of 0, 0.2, 0.4, 0.6, 0.8, and 1 m.

Feature Importance Analysis
For SCI features, intensity percentiles between 10% and 65% were selected as the most important features for separating classes from each other in leaf-off data. The most important features nominated by RF for SCI varied between minimum intensity (I min ) and the 70% percentile for leaf-on data ( Table 5). Table 5. The most important features selected for classification in leaf-off and leaf-on data. Green-colored features represent the most important and uncorrelated features. The descriptions of the features can be found in Table 4. The numbers in parentheses represent the feature importance score.

Leaf-Off
Leaf-On For MCI features, I min and percentiles close to 50% (I 45 and I 50 ) features were selected as the most important features in leaf-off data. I min , middle, and upper intensity percentiles (I 50 , I 55 , I 65 , and I 75 ) were selected for leaf-on datasets (Table 5).
Approximately half (9 and 8 features of 18 in leaf-off and leaf-on, respectively) of the RF-nominated features were either highly correlated (r ≥ 80%) or the same features were selected to separate different classes from each other; consequently, they were dropped after the correlation analysis step (Table 5). To explain with an example, in the SCI-Ch3 subgroup in leaf-off, the same feature (I 30 ) was nominated twice (to separate Norway spruce from nontrees and birch from nontrees), and it was highly correlated with I 65 (r = 91) (results not shown). Therefore, I 30 was selected due to its higher importance score (0.143) compared to I 65 (0.096).
For MCI features, I min and I 45 were selected for both gNDVI and gSR in leaf-off data, with I min and I 70 in leaf-on data. I 50 2 was selected for MCI-R i F in both leaf-off and leaf-on. Overall, minimum and middle intensity percentiles were more useful.
As illustrated in Figure 4, the distribution of the selected most important and uncorrelated features demonstrated that the selected features were suitable to separate Norway spruce, birch, and nontree classes. As an example, the value distribution of MCI-I min 3 and I 20 (in SCI-Ch3) features were descriptive in leaf-off and leaf-on data, respectively.  Table 5). The boxes show the mean and first and third quantiles in the box, as well as the minimum and maximum values in line, using validation data with a canopy threshold (Cth) of 0.4 m in leaf-off (A) and leaf-on (B) conditions. Features of single-and multi-channel intensity (SCI and MCI, respectively) data are plotted in left and right, respectively. Explanations for features are provided in Table 4.

Accuracy of Classification
As presented in Table 6, using MCI features, the OA of classification generally improved in the leaf-off condition from 83.70% to 96.74% as Cth increased from 0 m to 0.8 m. Similarly, in leaf-on conditions, the OA improved from 77.89% to 92.54% as Cth increased from 0 m to 1 m. In other words, in leaf-on conditions, OA values increased as Cth increased, whereas in leaf-off conditions, the initial increase of Cth from 0 to 0.2 m resulted in a jump of about 10% in OA, followed by a similar OA when Cth increased.
Using features from only SCI-Ch1 for classification, OA values ranged from 71.74% to 87.36% in leaf-off and from 46.27% to 58.95% in leaf-on conditions with Cth from 0 to 1 m. Only using features from SCI-Ch2 yielded OA values between 44.57% and 82.76% in leaf-off, and from 59.70% to 69.47% in leaf-on (Table 6).  Table 5). The boxes show the mean and first and third quantiles in the box, as well as the minimum and maximum values in line, using validation data with a canopy threshold (C th ) of 0.4 m in leaf-off (A) and leaf-on (B) conditions. Features of single-and multi-channel intensity (SCI and MCI, respectively) data are plotted in left and right, respectively. Explanations for features are provided in Table 4.

Accuracy of Classification
As presented in Table 6, using MCI features, the OA of classification generally improved in the leaf-off condition from 83.70% to 96.74% as C th increased from 0 m to 0.8 m. Similarly, in leaf-on conditions, the OA improved from 77.89% to 92.54% as C th increased from 0 m to 1 m. In other words, in leaf-on conditions, OA values increased as C th increased, whereas in leaf-off conditions, the initial increase of C th from 0 to 0.2 m resulted in a jump of about 10% in OA, followed by a similar OA when C th increased.
Using features from only SCI-Ch1 for classification, OA values ranged from 71.74% to 87.36% in leaf-off and from 46.27% to 58.95% in leaf-on conditions with C th from 0 to 1 m. Only using features from SCI-Ch2 yielded OA values between 44.57% and 82.76% in leaf-off, and from 59.70% to 69.47% in leaf-on (Table 6).
Leaf-off data provided more accurate classification results compared to leaf-on data, irrespective of the input data (MCI, SCI-Ch1, or SCI-Ch1; Table 6). In the leaf-off condition, MCI data improved the OA by~10% and~18% compared to SCI-Ch1 and SCI-Ch2, respectively. Similarly, MCI improved classification accuracy by 33% and 21% in leaf-on conditions compared to SCI-Ch1 and SCI-Ch2, respectively. Figure 5 shows the classified treetops within in an advanced seedling plot (T2G7) in leaf-off condition using Cth 0.4 m. Precision and recall had the same trends as OA, being most accurate in MCI in both epochs. Table 6. Accuracy of classification when canopy threshold (C th ) varied from 0 to 1 m in leaf-off and leaf-on conditions using multichannel intensity (MCI) features and only single-channel intensity (SCI-Ch1 and only SCI-Ch2). OA: overall accuracy; recall and precision: mean precision and recall values of all three (Norway spruce, birch, and nontree) classes, respectively. eaf-off data provided more accurate classification results compared to leaf-on data, irrespe input data (MCI, SCI-Ch1, or SCI-Ch1; Table 6). In the leaf-off condition, MCI data impro A by ~10% and ~18% compared to SCI-Ch1 and SCI-Ch2, respectively. Similarly, MCI impro fication accuracy by 33% and 21% in leaf-on conditions compared to SCI-Ch1 and SCIctively. Figure 5 shows the classified treetops within in an advanced seedling plot (T2G ff condition using Cth 0.4 m. Precision and recall had the same trends as OA, being ate in MCI in both epochs.

Tree Density Estimation
The density of Norway spruce trees was most accurately estimated in the leaf-off condition for all C th values among all plots. It was most accurate with the C th of 0.4 m, yielding an rRMSE of 37.9% and 56.96% (rBias: −23.44% and 44.76%) in leaf-off and leaf-on conditions, respectively ( Figure 6; Table 7). It was overestimated in leaf-off data from the C th of 0.0 to 0.4 m (rBias: −52.02% to −18.99%), but was underestimated in higher C th values and all C th values in leaf-on data. Density (unit: tree per hectare) of Norway spruce seedlings in leaf-off (a) and leaf-on (b) conditions, separating plots in young seedling stand (YoS, n = 5) and advanced seedling stand (AdS, n = 7) in canopy thresholds of 0.4 m. Table 7. Evaluation of the accuracy of Norway spruce tree density among all (n = 12), young seedling (YoS, n = 5), and advanced seedling (AdS, n = 7) plots in leaf-off and leaf-on conditions in each of the canopy thresholds (Cths). Unit: trees per hectare.    Table 7. Evaluation of the accuracy of Norway spruce tree density among all (n = 12), young seedling (YoS, n = 5), and advanced seedling (AdS, n = 7) plots in leaf-off and leaf-on conditions in each of the canopy thresholds (C th s). Unit: trees per hectare.

Leaf-Off
Leaf-On Density (unit: tree per hectare) of Norway spruce seedlings in leaf-off (a) and leaf-on (b) conditions, separating plots in young seedling stand (YoS, n = 5) and advanced seedling stand (AdS, n = 7) in canopy thresholds of 0.4 m.
Comparing results among AdS stands, the 1.0 m C th outperformed the other C th values with an underestimation of only 4.03% and 18.33% in leaf-off and leaf-on conditions, respectively. The 0.4 m C th provided the most accurate estimations for the YoS plots, with an rRMSE of 46.49% and 78.47% in leaf-off and leaf-on, respectively. Notably, the estimation had the same accuracy for both 0.2 and 0.4 m C th in leaf-off (rRMSE 46.49%).
The estimation of Norway spruce tree density was more accurate in AdS stands than YoS. For example, the rRMSE in leaf-off with a C th of 0.4 m was 26.83% in AdS and 46.49% in YoS (Table 7).

Tree Height Estimation
Regardless of C th , the height estimation of Norway spruce trees among all plots was more accurate in the leaf-on compared to the leaf-off condition. The most accurate result was achieved in leaf-on, with a C th of 0.2 m with an rRMSE of 10.76% (rBias: 6.88%) (Figure 7, Table 8). The lowest rBias with the 1 m C th (rBias 1.29%) was obtained as a result of overestimation in YoS and underestimation in AdS plots neutralizing each other. Also, correlation in the 0.2 m C th was highest (r = 0.98). In the leaf-off condition, however, the C th values of 0.6 and 0.8 m yielded the most accurate height estimation for Norway spruce with an rRMSE of 18.93%.
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 20 The estimation of Norway spruce tree density was more accurate in AdS stands than YoS. For example, the rRMSE in leaf-off with a Cth of 0.4 m was 26.83% in AdS and 46.49% in YoS (Table 7).

Tree Height Estimation
Regardless of Cth, the height estimation of Norway spruce trees among all plots was more accurate in the leaf-on compared to the leaf-off condition. The most accurate result was achieved in leaf-on, with a Cth of 0.2 m with an rRMSE of 10.76% (rBias: 6.88%) (Figure 7, Table 8). The lowest rBias with the 1 m Cth (rBias 1.29%) was obtained as a result of overestimation in YoS and underestimation in AdS plots neutralizing each other. Also, correlation in the 0.2 m Cth was highest (r = 0.98). In the leaf-off condition, however, the Cth values of 0.6 and 0.8 m yielded the most accurate height estimation for Norway spruce with an rRMSE of 18.93%.    Comparing the results among AdS plots, the C th of 1.0 m with underestimation of 3.33% and C th of 0.6 and 0.8 m with underestimation of 16.99% yielded the most accurate leaf-on and leaf-off data, respectively. In YoS, however, the most accurate result was obtained using a C th of 0.2 m in leaf-on, with underestimation of 0.88%.

Feature Importance Analysis and Classification
We aimed to explore the most important and uncorrelated ALS intensity features for the classification of individual segments. Nearly half of the most important intensity features, nominated by RF, were observed to be highly correlated (r ≥ 80%) or were repeatedly selected to separate classes from each other. This correlation issue was also documented by Shi et al. [37], where 60% of their ALS features had high correlation (r > 70%). I min , middle, and upper intensity percentiles (I 50 , I 55 , I 65 , and I 75 ) were most useful for leaf-on datasets ( Table 5). Mean and high intensity percentile values were shown to provide the most information for species classification by Shi et al. [37] and Axelsson et al. [38], respectively.
Using MCI data improved the OA of classification by approximately 10% and 33% in leaf-off and leaf-on conditions for all C th compared to SCI-Ch1 and SCI-Ch2, which proves our hypothesis about the ability of mALS features in classification. For example, the OA of 94.57% was obtained in leaf-off MCI data with C th 0.6 and 0.8 m, and was 84.78% and 81.52% when using only SCI-Ch1 and SCI-Ch2, respectively (Table 6). Our findings demonstrated the advantage of the combined use of MCI features over SCI-Ch1 and SCI-Ch2 in classifying seedlings. Many other studies reported similar results but in mature forests [31,36,[38][39][40]. To the best of our knowledge, mALS data have not been used to characterize seedling stands; hence, here we compared our findings with those of similar studies in mature forests. For example, Yu et al. [31] used mALS data to classify Scots pine, Norway spruce, and birch within 22 sample plots (32 × 32 m) containing 1903 mature trees, and achieved better accuracy (OA = 86%) using MCI features. Notably, OA is affected by other factors such as species composition, stand structure, age, and method of selecting best features, which differed among the various studies. Moreover, the intensity of laser returns was not calibrated in the present study, firstly because the question of whether intensity correction can improve the results could be a future topic of investigation, and secondly, because the use of MCI features in this study was intended to serve to eliminate possible variations in intensity.
Comparing SCI-Ch2 and SCI-Ch1 in the leaf-on condition, our findings showed more accurate OA when using SCI-Ch2 (OA of 69.47% with C th of 0.2 m), which could be due to the higher reflectance of vegetation in the NIR wavelength in the leaf-on condition.
The combined use of MCI features resulted in more accurate classification results. The most accurate classification yielded a mean precision of 0.93 using leaf-off MCI features with Cth values of 0.6 and 0.8 m, which was comparable with mean precision of 0.87 [14] and 0.81 [11]. Feduck and Mcdermid [14] used leaf-off UAV-RGB imagery to detect coniferous seedlings from nonseedling objects in harvested and replanted stands using a three-step object-based method, and Fromm et al. [11] used leaf-off and leaf-on UAV-RGB imagery to detect coniferous seedlings using a faster region-CNN. As we classified segments into more classes (Norway spruce, birch, and nontrees) than the two aforementioned studies, the fact that we obtained higher accuracies emphasizes the ability of the input data and method used in our study. Notably, the field measured seedlings in the other studies were different from ours in terms of number and size.
According to our findings, increasing C th improved the accuracy of classification, which could be due to minimizing the effect of ground returns and removing low-height segments. This was clearly observed, as the C th of 0 m yielded the worst results. Increasing C th above 1 m also led to the omission of shorter seedlings; thus, we focused on optimizing C th between 0 and 1 m.
The OA obtained with SCI-Ch1 and SCI-Ch2 when using the highest C th (1 m) in leaf-off condition were 87.36% and 82.76%, respectively, whereas a higher OA of 83.70% was already achieved when using MCI with the lowest C th (0 m; Table 6). A similar pattern was observed in the leaf-on condition. We concluded that using mALS makes it possible to perform detections with a smaller C th compared to both SCI-Ch1 and SCI-Ch2, due to the higher efficiency of mALS compared to SCI-Ch1 and SCI-Ch2 in the classification of vegetation. mALS demonstrated more reliable potential for classification compared to both SCI-Ch1 and SCI-Ch2 in characterizations of small seedlings.

Tree Density Estimation
Dense point clouds from ALS, processed with the ITD method, produced accurate density estimates of the seedling stands. The tree density of Norway spruce seedlings was most accurately predicted in leaf-off data using a C th 0.4 m (rRMSE: 37.9%). Our results also showed that mALS data outperformed in leaf-off compared to leaf-on for tree density estimation of Norway spruce seedlings.
The ITD method (watershed segmentation) was used to detect tree crowns in this study, which, to the best of our knowledge, has not yet been employed in characterizing seedling stands using ALS data. Two similar studies in seedling stands using ALS data with the ABA method in tree density estimation of seedling stands reported rRMSE values of 53.4% and 42%, respectively [9,27]. The improvement in accuracy achieved in this study could be firstly attributed to the use of the ITD method; secondly, the higher point density (~57 points/m 2 ) used in this work compared to those of approximately 5 and 1.2 points/m 2 applied in the two other studies, respectively, would be expected to improve the estimates. Therefore, dense point clouds from mALS can also enhance tree detection in seedling stands. Notably, the increase in accuracy in leaf-off conditions can be attributed to the higher point density (60.10 points/m 2 ) than in leaf-on (57.17 points/m 2 ).
Another seedling stands study at the stand level used low-density (~0.7 points/m 2 ), single-channel ALS with the ABA method and predicted the total tree density with an rRMSE of 47% [26]. According to Puliti et al. [9], a decrease in RMSE occurs when plot-level estimates are scaled to the stand level. Korhonen et al. [29] reported that using stand-level ABA with ALS usually results in higher accuracy than plot-level estimates, because random errors cancel each other out. Therefore, as our tree density estimation was at the plot level, the achieved rRMSE of 37.9% compared to the rRMSE of 47% obtained in the study of Ørka et al. [26] at the stand level further endorses the capability of the dense mALS data processed with the ITD method in this study. Our method also pursues plot-level estimations because more dense ALS data are becoming available.
Tree density estimation was more accurately predicted in AdS than YoS (Table 7), potentially because the shortest tree in AdS (minimum height of 1.57 m) was above the highest C th (1 m) used. Therefore, theoretically, selecting any C th (0-1 m), no trees were cut out in AdS for the classification phase, because all trees were already above 1 m. The situation was different in YoS, where the tree height ranged from 0.73 to 1.87 m, overlapping the C th range and allowing the change in C th to affect the tree density estimation by omitting trees shorter than the applied C th (Table 1).
The sharp treetops in both YoS and AdS lowered the chance of being hit by the laser. In YoS, however, the treetops fell within the range of the applied C th , whereas in AdS, the treetops were already above all the C th values, which caused the omission of treetops to be more influential in YoS, resulting in higher underestimation of tree density as C th increased. Note that when C th was smaller, overestimation was observed. Therefore, according to our C th optimization, the most accurate tree density in YoS was obtained with a C th of 0.4 m (rRMSE: 46.49% and rBias: −28.78%). In AdS, however, 1 m was the optimum C th both in leaf-off and leaf-on conditions (rRMSE 6.23% and 18.33%, respectively). Therefore, applying a C th of 1 m is recommended for seedling stands with minimum height above 1.5 m, as supported by our results for AdS.
Our tree density estimation in AdS and YoS outperformed the values reported by Imangholiloo et al. [7], who used the same ITD method as ours and UAV-PPC and hyperspectral data. They reported rRMSE values of 19.2% and 58.2% for AdS and YoS in leaf-on data, respectively. In this study, we obtained an rRMSE of 6.23% with a C th of 1.0 m for AdS, and an rRMSE of 46.49% in YoS with a C th of 0.4 m, both in leaf-off. The better results in this research may be due to the dense point clouds and the penetration of ALS inside the canopy, especially in dense plots.
In our characterizations of seedling stands, we achieved almost the same rRMSE using ALS as studies using UAVs. For example, Puliti et al. [9] and Imangholiloo et al. [7] reported rRMSE values of 36.3% and 38.1% using ABA and ITD methods, respectively. Reaching the same accuracy (rRMSE 37.9% in leaf-off, C th 0.4 m) reflects the superior ability of ALS, i.e., it provides advantages in penetrating the canopy, and is independent of direct sunlight, clear sky, and the ITD method used in our study.
In this study, we assessed the effect of different C th values. The results of C th optimization suggested C th values of 0.4 and 1.0 m for YoS and AdS, respectively, to estimate tree density. These values are almost equal with the selected C th values of 0.5 and 1.0 m for YoS and AdS, respectively, in the UAV study of Imangholiloo et al. [7]. We also found a C th of 0.4 m to be the optimum C th for all plots, which was the same as that obtained in the ALS study by Korpela et al. [39], who selected a C th of 0.4 m after testing C th in the range of 0.1 to 0.5 m. Ørka et al. [26] also used three C th values of 0, 0.5, and 1.0 m, employing ALS to predict tree density in regeneration forests. They achieved the most accurate tree density estimation using a 0.5 m C th .
In our study, misclassification of nontrees (especially bushes and grasses) as Norway spruces also affected the tree density estimation. For instance, with a C th of 0.4 m, the leaf-off data commissions (precision) of birch, Norway spruce, and nontree classes were 0.89%, 1.00%, and 0.67%, respectively (results not shown). Carefully checking the confusion matrix of the data, in the validation data (92 segments), all Norway spruces (58) were correctly classified; however, one nontree and one birch segment were misclassified as Norway spruce, leading to overestimation of Norway spruce tree density (rBias: −23.44%).

Tree Height Estimation
Dense point clouds from mALS analyzed with ITD methods produced highly accurate height estimation in seedling stands. The height of Norway spruce was most accurately predicted in leaf-on data using a C th of 0.2 m (rRMSE: 10.76%). The novel use of leaf-off and leaf-on mALS data in seedling stands demonstrated that leaf-on data outperformed leaf-off data for tree height estimation of Norway spruce seedlings.
A few studies have assessed trees height in seedling stands using ALS data with ABA methods. The heights of seedling stands were reported with rRMSEs of 28% [26] and 32.0% [9] at the stand and plot levels, respectively. The improvement in height estimation accuracy achieved in this study could mainly be attributed to the higher point density (~57 points/m 2 ) used in our research compared to the point density of approximately 0.7 and 5 points/m 2 applied in the two others, respectively. Therefore, dense point clouds from ALS also enhance the tree height estimation in seedling stands studies. In addition to the dense point clouds used, the difference could also be due to the ITD method used in this study.
The tree height was underestimated by 6.88% in leaf-on data with a C th of 0.2 m. Korpela et al. [28] also used the ITD method with leaf-on ALS and aerial imagery in seedling stands, and obtained underestimations of tree height of between 20% and 40%. Their study area was more complex (i.e., higher tree density and classification classes) than that in this study. The point density of 57.17 points/m 2 used in this study was higher than theirs, i.e., 6-9 points/m 2 , and they used spectral data from aerial images instead of our use of intensity data from mALS.
Two similar studies, Vepakomma et al. [8] and Imangholiloo et al. [7], using the ITD method and UAV-PPC data instead of ALS, underestimated height by 0.39 and 0.16 m (6.90%), respectively. Reaching the same accuracy (bias: 0.16 m in leaf-on, C th 0.2 m) reflects the superior ability of ALS, as it operates independent of direct sunlight and clear sky. Goodbody et al. [15] and Puliti et al. [9] applied UAV in seedling studies using OBIA and ABA methods, achieving an RMSE of 0.92 m and an rRMSE of 30.9%, respectively. The improvement (RMSE: 0.25 m and rRMSE: 10.76% in leaf-on, C th 0.2 m) could be due to the ITD method and the dense point clouds used in this study.
In terms of optimum C th for tree height estimation, the C th values of 0.2 m, and 0.2, and 0.6 m were found to be optimal for YoS and AdS, respectively. These values are lower than the 0.5 and 1.0 m C th values for YoS and AdS, respectively, selected by Imangholiloo et al. [7]. Similarly, in terms of optimum C th among all plots, the C th of 0.2 m in this study was lower than the 0.4 m used by Korpela et al. [28]. The lower C th could be due to the remaining (correctly classified) tall spruces, thereby increasing the mean plot height. Ørka et al. [26] used three C th values of 0, 0.5, and 1.0 m for ALS point clouds to predict tree-height-related attributes in regeneration forests. They achieved the most accurate height estimation with a C th of 0 m, which is close to the C th value of 0.2 m in our study.
Another factor potentially influencing height estimation is the accuracy of Näslund's model in predicting heights of field data, as well as the accuracy of the DTM used for the height normalization of ALS point clouds. If tree height is measured at the individual tree level during field surveys, Näslund's model would not be needed and the ALS-estimated tree heights could be assessed at the individual tree and plot levels. More important than the previously mentioned factors, possible tree height growth between leaf-off ALS data acquisitions and field data collection (time lag was about 45 days) could have caused the higher underestimation of height in leaf-off than in leaf-on conditions.
Another issue in the tree height estimations of YoS was their higher vulnerability to the influence of DTM errors compared to AdS. When height normalization of point clouds was conducted, we assumed that errors affected over-or under-estimations of YoS heights more than for AdS. To illustrate, if DTM has a 20 cm error, for example, its proportional error to a tree with a height of 1 m (20%) would be higher than for a tree with a height of 20 m (1%). As a result, the error in DTM for height normalization can influence tree density estimation, especially for YoS. This consequently affects height estimation, because trees are relatively smaller in YoS than AdS; hence, the error in the DTM influences the estimation proportional to their short height more significantly.

Comparing Leaf-Off and Leaf-On Data
We observed that the mALS data acquired in leaf-off conditions produced the most accurate estimates for classification and tree density. Leaf-on data produced more accurate results for tree height estimation.
For classification in leaf-off and leaf-on conditions, the advantage seen from the use of leaf-off data in this study was in line with the findings of Kim et al. [40], who used leaf-off and leaf-on ALS data for species classification of mature broadleaved and coniferous trees. Their leaf-off data yielded a higher OA (83.4%) than leaf-on data (73.1%). The outperformance of leaf-off ALS data was supported by other studies that used Optech Titan mALS data for species classification of trees in mature forests [31,38]. The more accurate OA in the leaf-off condition in our study may have occurred because coniferous trees are green in leaf-off, while other classes (birch, grass, etc.) are leafless. Therefore, the classification was easier in leaf-off than in leaf-on conditions, in which both tree species (Norway spruce and birch) and nontree classes were green, and harder to separate. Leaf-off MCI data were more favorable for classification than leaf-on because the interference of grass and bushes was eliminated in the leaf-off condition. Classification was closely related to the accuracy of estimating tree density; thus, leaf-off also produced more accurate results compared to leaf-on data.
In terms of height estimation of Norway spruce seedlings, leaf-on data yielded more accurate results. Compared with mature forests, leaf-off ALS data analyzed with ABA outperformed leaf-on data [41][42][43]. The decision to use leaf-off data was, however, in contrast with our findings that showed higher suitability of leaf-on data for height estimation. This finding could be due to height growth between leaf-off ALS data collection and field height measurement during leaf-on time (time lag of 45 days), as explained in Section 4.3.

Conclusions
The ability of mALS compared to single-channel ALS data to characterize tree attributes (species, density, and height) in seedling stands in YoS, AdS, and all plots was assessed in this study. The data were obtained in leaf-off and leaf-on conditions. The optimal C th for each of the estimations was addressed separately.
According to our classification results, the MCI data produced more accurate results compared to both SCI-Ch1 and SCI-Ch2 for characterizations of small seedlings (YoS). The use of MCI data also allowed detection using a lower C th compared to both SCI-Ch1 and SCI-Ch2. The reason for this finding probably lies in the higher efficiency of MCI than SCI in the detection of vegetation. It is critical to adjust C th depending on the tree height in seedling stands, since, based on the results, the classification accuracy improved with increasing C th .
Our results also demonstrated the capability of mALS for tree density estimations in seedling plots. The results were further improved with the use of the ITD method. A C th of 0.4 m yielded the most accurate tree density estimation for all plots, whereas C th values of 0.4 and 1.0 m yielded the best results for YoS and AdS, respectively. Overall, tree density was estimated more accurately in AdS compare to YoS, and in the leaf-off compared to the leaf-on condition. Leaf-on conditions, however, resulted in more accurate tree height estimates. The C th of 0.2 m resulted in the most accurate tree height estimates for all plots, whereas C th values of 0.2 and both 0.2 and 0.6 m resulted in the most accurate results for YoS and AdS, respectively. We concluded that mALS data are a reliable resource for undertaking RS-based inventories of seedling stands. Further research with a larger number of sample plots including diverse seedling species, densities, and heights is required. Generally, leaf-off data are recommended, since the data can additionally provide accurate DTMs, if not otherwise available.