Large-Scale Mapping of Tree Species and Dead Trees in Šumava National Park and Bavarian Forest National Park Using Lidar and Multispectral Imagery

Knowledge of forest structures—and of dead wood in particular—is fundamental to understanding, managing, and preserving the biodiversity of our forests. Lidar is a valuable technology for the area-wide mapping of trees in 3D because of its capability to penetrate vegetation. In essence, this technique enables the detection of single trees and their properties in all forest layers. This paper highlights a successful mapping of tree species—subdivided into conifers and broadleaf trees—and standing dead wood in a large forest 924 km2 in size. As a novelty, we calibrate the critical stopping criterion of the tree segmentation based on a normalized cut with regard to coniferous and broadleaf trees. The experiments were conducted in Šumava National Park and Bavarian Forest National Park. For both parks, lidar data were acquired at a point density of 55 points/m2. Aerial multispectral imagery was captured for Šumava National Park at a ground sample distance (GSD) of 17 cm and for Bavarian Forest National Park at 9.5 cm GSD. Classification of the two tree groups and standing dead wood—located in areas of pest infestation—is based on a diverse set of features (geometric, intensity-based, 3D shape contexts, multispectral-based) and well-known classifiers (Random forest and logistic regression). We show that the effect of underand oversegmentation can be reduced by the modified normalized cut segmentation, thereby improving the precision by 13%. Conifers, broadleaf trees, and standing dead trees are classified with overall accuracies better than 90%. All in all, this experiment demonstrates the feasibility of large-scale and high-accuracy mapping of single conifers, broadleaf trees, and standing dead trees using lidar and aerial imagery.


Introduction
Forest inventories based on remote sensing data vary in terms of scale, sensors, and calculated forest structural attributes. For example, Latifi and Heurich [1] reported that lidar point clouds advantageously fused with optical imagery are the most prominent choices for inventory of forest structural variables at the landscape and regional scales. Spatial distribution of trees and standing or fallen dead trees are key forest variables for estimation of forest attributes such as aboveground biomass and growing stock. Aside from area-based methods, tree-level approaches estimate forest inventory parameters utilizing segmented single trees. The study of Latifi et al. [2] compared both approaches for a temperate forest area of around 300 km 2 in a large-scale experiment and showed that the cost of data collection could be reduced by 90% compared to a conventional forest inventory.
For single tree detection, a huge set of raster-based approaches are available that use the canopy height model (CHM) as basic information [3]. Aside from these low-level methods, more sophisticated approaches operate on the point cloud instead of using the CHM. Strîmbu and Strîmbu [4] present a three-dimensional approach in which a weighted graph is built from hierarchical topologies of lidar data, thereby generating three-dimensional tree segments. The authors indicate an average detection accuracy of 98% for simple, regular tree structures and 85% for complex tree structures. The approach of Dai et al. [5] merges mean-shift clusters initially generated from different lidar-based feature spaces to individual trees using both the spatial domain and the multispectral domain. In the case of multispectral lidar data, this study reports a detection rate of 88%. Reitberger et al. [6] proposes a novel segmentation method based on normalized cut [7] that detects single trees using a graph cut approach. The study successfully shows a significant improvement of the single tree detection (up to 20%) in the lower forest layers.
Graph-cut based segmentation methods, such as normalized cut, partition the underlying weighted graph that is being finished with a dependence on a static stopping criterion. The stopping criterion controls the iterative segmentation and has no physical or object specific meaning. It must be either defined based on practical experience or optimized by means of a sensitivity analysis using reference data [8]. However, the stopping criterion is applied statically, without accounting for the actual local tree species; therefore over-and under-segmentation occurs when using the stopping criterion averaged over the reference data. Recently, Amiri et al. [9] improved the normalized cut based tree segmentation by replacing the static stopping criterion with an adaptive classifier that analyzes the point cloud of the cluster to be partitioned. However, this extension of the basic method only works for conifers and fails for broadleaf trees.
Optical imagery has become a standard source to discriminate tree species [10,11]. Recently, dense matching has emerged as a mature technique used for reconstructing objects from a series of highly overlapping images at the pixel level [12]. When applied in a forest area, the dense point cloud of a canopy surface can be used for tree species classification either at the tree level, or using an area-based approach [13]. Moreover, the combination of multispectral images with lidar advantageously enriches the limited radiometric range of lidar. Ullah et al. [14] demonstrated an improvement in the estimation of forest structural parameters at the stand and forest compartment levels using point clouds generated from aerial imagery.
Over the past decade, lidar has emerged as an excellent data source for classifying tree species. Several studies have demonstrated the feasibility of tree species classification using structural features, such as crown shapes, height distribution percentiles, and proportions of first/single returns, obtained from lidar point clouds [15][16][17][18]. The application of the newly introduced full waveform lidar systems has also improved accuracy by applying waveform features that use detailed backscattered pulse information, such as the intensity and pulse width [19,20]. Reitberger et al. [6] also demonstrated the advantages of full waveform lidar for tree species mapping in the Bavarian Forest National Park by utilizing the intensity and pulse width. For the same forest area, Yao et al. [21] showed that broadleaf trees and conifers could be classified with an overall accuracy of 78%. However, Shi et al. [22] found that the classification accuracy decreases by 30% if the detailed tree species mapping (six different tree species) is attempted for the same study area. The study in Hovi et al. [20] reported an overall accuracy of 75% for the classification of three main tree species in Finland using lidar waveform features.
The advent of high resolution lidar point clouds made possible the successful detection of individual standing dead trees located in areas of pest infestation. Yao et al. [21] was able to detect standing dead trees with crowns with an accuracy between 71% and 73%. The approach only used the lidar intensity (single wavelength 1550 nm) and geometric features such as the crown shape and point height distribution. Recently, Polewski [23] presented a method that combines single 3D tree segments with multispectral aerial imagery for mapping of standing dead trees with crowns. A two-class classification (dead tree and non-dead tree) led to an accuracy of around 88% using features generated from the covariance matrix of three image channels. The detection of standing dead trees without crowns (i.e., snags) was tackled in Polewski et al. [24]. The approach generates salient features by free shape contexts suitable to describe the single snags in a lidar point cloud. The study reports a classification accuracy between 71% to 73%. At present, these techniques for mapping dead trees have only been applied to small control plots.
The structure of a forest is one of the most important variables for the future development of forest stands [25] and has considerable impact on the presence of protected species [26]. For adequate management, the description of species composition, spatial structure, natural dynamics, and various tree characteristics is mandatory. These forest characteristics are also needed as explanatory variables for many research issues. Šumava National Park and Bavarian Forest National Park are collecting forest structure data in the field on a permanent research plot, whose total area is relatively small when compared with the size of the entire forest area [27]. So far, slightly different measurement methodologies have been applied in both parks, such as update periods and plot size, hence complicating forest data comparisons and cross-border mapping of forest structures. Lidar offers the unique opportunity to describe the forest structure of both neighboring parks in a consistent data set generated at the same time and based on identical methods.
In summary, tree segmentation based on normalized cut using a static stopping criterion generates over-and undersegmentation effects depending on the distribution of tree species. Therefore, it appears advantageous to apply a stopping criterion that is based on a tree species's information obtained from a preliminary classification of conifers and broadleaf trees. Moreover, the complete mapping of conifers, broadleaf trees, as well as standing dead trees, has not yet been successfully demonstrated in a large temperate forest area using lidar and multispectral imagery. Contrary to the experiments normally conducted in small test areas, the results of a large-scale experiment are statistically more relevant and more reliable. Finally, the area-wide mapping of individual trees-especially of dead trees and snags-across the border allows for a seamless representation of the ecosystem that is important for forest biodiversity conservation.
The main objectives and the novelties of the current study were (i) to optimise tree segmentation based on normalized cut using two different stopping criteria for conifers and broadleaf trees and (ii) to map conifers, broadleaf trees, standing dead trees, and snags in a large temperate forest 924 km 2 in size situated in Šumava National Park and Bavarian Forest National Park. Thereby, we verified the statistical performance of the methods in a large-scale experiment and provided spatial information about the distribution of trees usable for area-wide information on the abundance and distribution of key habitat structures [28].

Study Area
Established in the Bohemian Forest in 1991 and 1970, respectively, Šumava National Park (49 • 12 N, 13 • 30 E) and Bavarian Forest National Park (49 • 12 N, 12 • 58 E) protect 92,464 ha of mountain spruce forests. The forest cover more than 82% of this area, one of the largest complexes in central Europe. Both parks are part of the Natura 2000 network, which was established to protect the most endangered habitats and species in Europe, as defined in both the Habitats Directive (1992) and Birds Directive (1979). This forest area comprises natural and secondary habitats with numerous rare and protected plant and animal species of exceptional natural value of European-wide significance [27]. Although divided by the state border between Czech Republic and Germany, the forest nature is consistent for the area of both parks.
Both parks are dominated by Norway spruce (Picea abies) with European beech (Fagus sylvatica), silver fir (Abies alba) and larch (Larix). The parks also contain some other tree species, such as white birch (Betula pendula), sycamore maple (Acer pseudoplatanus), and common rowan (Sorbus aucuparia) [29]. Due to bark beetle infestation, large areas are now characterized by dead wood, comprising fallen dead trees, standing dead trees, and standing dead trees without crowns (=snags). Note that we did not focus on the detection of fallen dead trees. Moreover, we excluded dead trees located in windthrow areas. Figure 1 shows a map of the two parks and the reference plots (see Section 2.4.1).

Lidar Data
The airborne full waveform data were acquired in June 2017 (leaf-on condition) using a RIEGL LMSQ 680i instrument, which was carried by a helicopter at a flying altitude of 550 m and at a speed of 60 kts. Parallel flight strips were flown with side lap of 60%. Therefore, the resulting average point density was 55 points/m 2 . In total, the area of the lidar campaign was 943 km 2 . The lidar data were provided as LAS-files containing an amplitude value for each laser point. Table 1 summarizes the parameters of the RIEGL LMSQ 680i scanner and the lidar flight campaign.
The raw amplitude values I raw were corrected with regard to travelling distance s of the laser beam assuming a mean flight height s 0 = 400 m with The parameter n was calibrated using the lidar data of a calibration flight that was conducted on a near-by airfield. Following the procedure suggested by [6], we calculated for parameter n = 2.03.
Finally, the entire lidar set was geometrically checked in planimetry and height using enclosing polygons for flat buildings and the mean height of small horizontal areas. On average, the mean horizontal and vertical displacements were 5 cm and 6 cm, respectively.

Aerial Multispectral Data
For both parks, the multispectral aerial imagery was acquired on October 2017 and June 2017 in two separate aerial flight campaigns. In Šumava National Park, the fall foliage change had already started for most of the broadleaf trees (see Figure 2a). Therefore, the spectral appearance was similar for beeches and dead trees. Moreover, because of low solar altitude, the shadows of the trees were fairly long. In Bavarian Forest National Park, the color infrared imagery was optimal because the data were acquired in June 2017. Note that the acquisition time for the lidar campaign is almost the same. Figure 2a,b show typical areas containing conifers, broadleaf trees, and dead wood. The ground sample distance (GSD) was not identical for the two park areas and had values of 17 cm and 9.5 cm for the Šumava and Bavarian Forest National Parks, respectively. In total, 669 images were captured for Šumava and 2553 for Bavarian Forest. Additionally, Global Navigation Satellite System (GNSS) data and Inertial Navigation System (INS) data were also acquired to provide initial information about the exterior orientation. The mean above-ground flight heights were 3900 m and 2880 m for the Šumava and Bavarian Forest National Parks, respectively. The images had end-laps of 80% and side-laps of 60%. They contained four spectral bands (blue, green, red, and near-infrared). The aerotriangulation (AT) was calculated using the following input: the aerial images, the camera calibration model, and the image measurements for the control points. As expected, the sigma naught of the AT was excellent, with 0.22 GSD for Šumava and 0.3 GSD for Bavarian Forest. For the software packages, ISAT [30] and MATCH-AT [31] were used for Šumava National Park and Bavarian Forest National Park, respectively. The software packages OrthoMaster/OrthoVista [32] generated true orthophotos utilizing the digital surface model of the total forest area. Table 2 shows the details of the aerial flight campaigns and the subsequent photogrammetric processing of the aerial imagery.

Field Measurements
Ground truth data for sample plots were acquired by field measurements in four reference data set Bioklim CZ [33], HTO [34], Bioklim [33], and LAI Bioklim [35]. In both parks, tree height measurements were carried out using the "Vertex" III system [36] considering trees with a diameter at breast height (DBH) > 7 cm. In rectangular HTO plots, each stem position was precisely surveyed by means of tacheometry and double-checked with differential GPS (DGPS) using the Leica GPS System 500 instrument. The coordinate differences indicated an excellent standard deviation of 5 cm [34]. The centers of all remaining circular plots were measured by means of differential GNSS (DGNNS). In the case of Czech plots Bioklim CZ, the Leica ATX1230 GG instrument provided plot center locations, reporting a position accuracy between 3 cm and 96 cm. For the Bioklim and LAI plots, a Leica GS 14 Professional GNSS receiver was used [33]. The range of the internal accuracy estimation was between a few centimeters and several decimeters. In Czech plots Bioklim CZ, tree positions were collected with a laser range-finder and a magnetic compass IFER-MMS from Field-Map Technology (see Zenáhlíková et al. [27]). In Bavarian plots Bioklim and LAI, the distance was surveyed using a "Vertex" III system and a compass for the azimuth (see Bässler et al. [33]). Table 3 gives an overview of the tree species distribution in the four reference data sets.
The reference trees were further subdivided into three layers with respect to the top tree height htop in the plot. The parameter htop is defined as the average height of the 100 highest trees per ha [37]. The lower layer contains all trees below 50% of htop, the intermediate layer corresponds to all trees between 50% and 80% of htop, and the upper layer contains the remainder of the trees.

Reference Trees by Manual Labelling
Additional reference data were collected by visual inspection using polygons of segmented trees. In total, six different rectangular plots A1, A2, A3, B1, B2, and C1 (see Figure 1) were selected. These reference plots are dominated by broadleaf trees, conifers, dead trees and snags, thereby providing a varied composition of trees for classification. Table 4 gives an overview on the total number of trees manually labelled. We used an interactive tool that displays polygons of tree segments in the aerial image, as well as the segmented point cloud of tree segments. The point cloud can be rotated in 3D, thereby aiding the expert in verifying the class of investigated tree objects. In detail, we labelled samples for the classes 'conifer' and 'broadleaf', 'dead tree (with crown)', 'non-dead tree', 'snag', and 'non-snag'. The identification of living and dead conifers in both parks was fairly easy (see Figure 3a,e,g). The broadleaf trees in Bavarian Forest National Park could also be classified without problems (see Figure 3f). However, broadleaf trees in Šumava National Park were difficult to label because many were already at the leaf-off stage and were therefore visually similar to dead trees (see Figure 3b). For the same reason, the labelling of dead trees in Šumava National Park was also a fairly difficult task (see Figure 3c,d). Instead, single dead trees were in many cases perfectly silhouetted against the background in Bavarian Forest National Park (see Figure 3g). Some problems occurred when they were overlapped with nearby spruces (see Figure 3h). In both parks, snags could only be found using the point cloud of the tree segments because the snags were practically invisible in an aerial image (see Figure 3i,j).

Outline of the Method
In our approach, we classifed tree groups 'conifers' and 'broadleaf trees' as well as standing dead trees and snags using lidar and multispectral imagery. A fairly large feature set was generated comprising geometric features, intensity features, features for snags based on 3D shape contexts, and features generated from the multispectral images. Overfitting effects were reduced by automatically selecting prominent features. In our study, the following methodology was applied. We segmented the lidar point cloud into 3D segments representing single trees using the normalized cut algorithm [6,7] (see Section 2.6). The procedure recursively partitioned the lidar cloud into single tree segments until a predefined stopping criterion was achieved. The aim of the segmentation step was to provide tree segments that are used in the next steps to classify the segments with respect to the two tree groups, dead trees and snags. Figure 4 illustrates the entire procedure, which is explained in detail in the following subsections.   Figure 4. Overview of the approach for classification of conifers, broadleaf trees, standing dead trees, and snags.

Single Tree Detection
For single tree detection, we used the normalized cut segmentation which is a top-down method for segmenting objects over a discrete graph structure G = (V, E), with V as the node set, E as the edge set, and w ij (o i , o j ) as the symmetric, non-negative pairwise object similarity function.
..N is the similarity matrix representing the pairwise weighted interrelationship between N primitives O = {o i } i=1...N of a super-voxels, which is provided by a pre-segmentation with k-means. The normalized cut algorithm [7] recursively bisects the graph into disjoint K subgraphs representing clusters A and B by minimizing the objective function (=normalized nut) where Assoc(A) = ∑ u∈A,v∈V w ij (A, V) is defined as the volume of subgraph A and is equivalent to the sum of weights of all edges ending up in cluster A. Equation (2) maximizes the similarity within clusters and minimizes the similarity between the disjoint clusters A and B.
The weighting matrix W is created from a similarity function where d k (o i , o j ) 2 denote the M features used as similarity measures. In this study, we used the feature weights σ i for the values recommended in [6] (see also Section 2.10).
The recursive splitting of graph G into disjoint K subgraphs must normally be terminated by threshold value NCut max . This parameter has no physical meaning and can be defined based on practical experience or using a trial-and-error strategy. In this study, a sensitivity analysis using training data was performed to find an optimum value for the objects to be segmented. However, the optimal value for NCut max turned out to be different for broadleaf and coniferous trees. Thus, we modified the segmentation and used two different stopping values for NCut max , thereby making the segmentation dependent on the tree groups. In detail, we initially segmented the point cloud using the larger value of both NCut max parameters. In our case, this value was representative for conifers. Next, we applied the tree classification (see Section 2.8) to all tree segments. Third, the tree segments of the broadleaf trees were clustered with k-means clustering. Finally, the new tree segments potentially formed by broadleaf trees were newly segmented using the other NCut max parameter, which was the best for broadleaf trees.

Feature Extraction
In this study, for each tree segment, we calculated the following feature set from the lidar data and multispectral images. The feature set can be subdivided into four main groups.
(i) Geometric features: Feature set S g set includes (i) height-dependent variables at 10% intervals, (ii) density dependent variables at 10 intervals, and (iii) the two axis lengths of a paraboloid fitted to a tree crown (see Reitberger et al. [6]).
(ii) Intensity features: The lidar point cloud contained, for each laser point, an amplitude value that was calibrated in advance (see Section 2.1). Based on this laser point attribute, the mean amplitudes per tree segment and in 10 tree layers were calculated to form feature subset S I .
(iii) 3D shape contexts: For the snag classification, we followed the idea of Polewski et al. [24] and applied 3D shape contexts, which are constructed as cylinders with radius r seg and length l seg along the tree segments' axis (see Figure 3k,l). Features S 3DC are essentially histograms representing the spatial distribution of the points within the cylindrical volume.
(iv) Covariance matrix: For this feature subset, we projected the tree segments into the next aerial color infrared image using the exterior orientation. Within the resulting enclosing polygon, we calculated the 3 × 3 covariance matrix and the mean of the three channels. The six independent elements of the covariance matrix and the mean channels form the nine elements of feature set S cov .

Classification of Conifers and Broadleaf Trees
For classification of conifers and broadleaf trees, we used the feature sets S g (= geometric features), S I (= intensity feature), and S cov (= features from aerial images). We chose Random Forest (RF) [38] as a binary classifier because of its robust ability to estimate the class probability. Note that RF provides a probability for each classified object.

Classification of Standing Trees and Snags
To classify both types of dead trees based on features S cov and S 3 DC we applied a multinomial logistic regression (MLR). We used a regularized-generalized model, which was implemented in the glmnet function as an R source package. Similar to RF, MLR classifies each dead tree with a probability p dead for dead trees and p snag for snags. Note that we also tested RF and Support Vector Machines (SVM) as classifiers. However, MLR turned out to be the best option for this classification task.

Choice of Parameters
The 3D tree segmentation based on the normalized cut was controlled by parameters listed in Table 5. According to Yao et al. [39], we introduced the quadratic distances D 2 between super-voxels i and j as features. These two features were completed by parameter F max XY ij , which defined the maximum distance of a super-voxel pair i and j to a tree position provided by the watershed segmentation. The amplitude of the laser points processed in Section 2.2 was neglected because this feature has practically no impact on the quality of the segmentation. Therefore, the weighting matrix W can be specified as where R XY defines the vertical cylinder within which the weights w ij are calculated. The control parameters σ xy , σ z and σ F weight the corresponding features in Equations (3) and (4). The values used, which are listed in Table 5, follow the recommendations of Yao et al. [39].
To calculate the features S 3DSC of the snag detection (see Section 2.7), we used the control parameters listed in Table 6.

Evaluation and Classifier Training
We conducted four groups of experiments. First, we segmented single trees using the modified normalized cut segmentation. In this experiment, we showed that two different values NCut max can be successfully applied to segment trees in dependence on the tree classes "conifer" and "broadleaf tree". Second, we classified conifers and broadleaf trees using different feature sets. The third and fourth experiments focused on the classification of dead trees and snags, respectively. All four experiments utilized the reference data from the field measurements and the visual inspection (see Section 2.4).
In experiment #1, recall and precision quantities were used to evaluate the quality of the detected trees. The ratio of the reference tree numbers, which have at least one associated detected tree, to the total number of reference trees is called 'recall'. The number of detected trees that were successfully connected to a reference tree as a fraction of the total number of detected trees is defined as the 'precision'. Both recall and precision are calculated depending on stopping criterion NCut max . Following the strategy proposed by Reitberger et al. [6], we considered two trees to be matched trees, if (i) the distance between the segmented tree to the reference single tree is smaller than 60% of the mean tree distance within the sample plot and (ii) the height difference between the segmented tree and the reference tree is less than 20% of the top height of the plot. If a reference tree was connected with more than one segmented trees, the tree position with the shortest distance to the reference tree was selected. In cases where a segmented tree could not be combined with a reference tree, we considered this to be a 'false positive' segment.
For classification of conifers and broadleaf trees (=experiment #2), we combined reference trees from field measurements and visual inspection to train the classifier. As a classifier, we used the RF classifier from Section 2.8. The reference data were split into two sets for training and testing. The RF classifier was also validated with reference data provided only by field measurements. For dead tree classification (=experiment #3 and #4), we only used reference trees determined from visual inspection. In both cases, the classifier was logistic regression (see Section 2.9). For these two latter experiments, the reference samples were also subdivided into two data sets for training and testing. We used a five-fold cross-validation to obtain the overall classification accuracy, precision, and recall as a compromise between the computational efficiency and reducing the effects of randomness. All classifiers were evaluated with test data not being used for training. All results are presented with confusion matrices calculated from reference data and test data.

Single Tree Detection
A sensitivity analysis was conducted in the plots HTO, Bioklim, LAI, and Bioklim CZ to find the best value for parameter NCut max . First, we applied the same value for parameter NCut max to conifers and broadleaf trees. An analysis of Table 7 with Tables 8 and 9 let us conclude that data set HTO was the best. In general, the recall of data set combination Bioklim, LAI, and Bioklim CZ was around 20% worse when compared with HTO data set (Table 8). When we compared the data set Bioklim CZ (Table 9) with HTO data set (Table 7), we also found a significant discrepancy in the recall, especially in the upper layer and, in general, for the total mean. The precision of this data set was 15% better than HTO in total mean. In summary, data set HTO was the most reliable, also because of the high absolute accuracy of the tree positions (see also Section 2.4.1). Thus, parameter NCut max = 0.16 seemed to be the best compromise for controlling the segmentation for both conifers and broadleaf trees.
We further analyzed the segmentation with respect to conifers and broadleaf trees. Because we found that many broadleaf tree areas had a clear oversegmentation, we applied the new segmentation scheme discussed in Section 2.6. After varying parameter NCut max in a reasonable range, we found optimized values for this control parameter. Tables 10 and 11 demonstrate clearly that NCut max = 0.10 increases the precision by 13% at a small cost of deterioration in recall. Figure 5c-f show nicely the reduction of oversegmentation with parameter NCut max = 0.10 for the preclassified broadleaf trees.

Classification of Conifers and Broadleaf Trees
For the Czech and German parts of the forest areas, we trained two separate classifiers for classification of conifers and broadleaf trees because the aerial images were captured at various times and with different cameras. We used 3D geometry features S g describing crown shape and point distribution per tree and feature set S cov generated from the aerial images (see Section 2.7). Interestingly, the intensity features S I finally had to be ignored because, in some areas, the intensity distribution was not consistent and showed severe tiling effects (see Figure 6a,b). Tables 12 and 13 show the validation of the classifiers on test data taken from four reference areas. In general, the classification results were excellent. For Šumava National Park, the overall accuracy (OA) of 97.5% was better by 5-7%. Moreover, the confusion between conifers and broadleaf trees was significantly lower, and broadleaf trees were better classified in terms of recall and precision. Very likely, the fall foliage helped to discriminate among conifers and broadleaf trees in Šumava National Park for this accuracy level.

Classification of Standing Dead Trees
For both parks, we trained two separate classifiers to classify dead trees using the feature data set S cov because of the different acquisition times for the aerial imagery. Note that the training data set was automatically balanced. Therefore, the number of dead trees and living trees in the confusion matrices differed from the relevant numbers indicated for training and testing in Tables 14 and 15. Moreover, a class probability p dead = 90% was used to classify a dead tree.
For Bavarian Forest National Park, the OA of 91.7% calculated from the training data with five-fold cross validation was confirmed by the OA of 92.6% performed in the test area. Conversely, the OA of 82.8% for Šumava National Park was slightly reduced for the training data. Very likely, the classifier was influenced by the spectral similarity of the dead trees and the beeches. The OA of 100% for the test data was certainly too optimistic. Presumably, manually labelled trees could be classified easier.

Classification of Snags
In experiment #4, only one classifier was trained because the used feature set S 3DSC solely depends on lidar data, and the 3D structure of snags was assumed to be the same for both parks. Balancing of the training data and usage of class probability p snag were the same as for the classification of dead trees. The excellent OA of 90.9% in the training area is in the same order of magnitude as the OA calculated with the test data (see Table 16). Figure 7 shows examples for labelled snags that were classified as snags with p snag > 90%. However, the analysis of the test reference revealed a significant confusion between snags and living trees.

Main Goals of the Study
The main goals of the current study were to classify two tree classes (coniferous and deciduous), standing dead trees, and snags in a large forest area using lidar and multispectral imagery. We also showed a novel calibration of a control parameter that stops the iterative normalized cut segmentation.

Single Tree Detection
The experiments clearly showed that the segmentation accuracy depends on conifers and broadleaf trees. The best values for parameter NCut max , which controls the stopping of the iterative normalized cut method, were 0.10 for conifers and 0.18 for broadleaf trees. Reitberger et al. [6] recommended an average value of 0.16 for both tree groups. Instead, we used two different parameters for NCut max and could improve the segmentation quality in terms of precision by 13%. However, we had to rerun the segmentation for broadleaf trees again. Thus, the improvement in accuracy was gained with an increased computation time of roughly one-third. Recently, Amiri et al. [9] replaced control parameter NCut max with an adaptive stopping criterion that analyses the remaining point cloud of the segmented clusters and allows a classifier to determine the termination of bisecting of the clusters. However, this method only works for conifers and is not applicable for broadleaf tree.

Classification of Conifers and Broadleaf Trees
In general, the results of tree classification were excellent, with values for OA beyond 95% for both parks. The laser intensity, calculated as a mean value for the upper part of the tree crown, turned out as useless. Very likely, the calibration parameter for intensity normalization was not representative for the entire project area. Interestingly, the spectral information of CIR aerial imagery successfully replaced the laser intensity as the salient feature in the tree classification process.
The OA in Tables 12 and 13 calculated from different test data sets (Reference #1 and #2 for Bavarian Forest National Park; Reference #3 and #4 for Šumava National Park) also demonstrated the transferability of the classifiers. For Bavarian Forest National Park, we noticed a difference of 7.8%, whereas the OA for Šumava National Park was essentially the same for both test areas.

Classification of Dead Trees and Snags
In both parks, the classification of dead trees performed with excellent accuracy, better than 90% in terms of recall and precision. In comparison, the accuracy in Šumava National Park was slightly reduced because the fall foliage of broadleaf trees was spectrally similar to the crowns of the dead trees.
Finally, the training of the snag classifier showed a fairly good OA of 91%. Although the OA for the test data was even better with 96%, the recall and precision for snags was only 66% and 56%, respectively. Because of the smooth transition from dead spruces with crowns and without crowns (=snags), we could not find an optimal classifier for snags. A detailed analysis revealed that some false positives were dead trees with only a few branches. Thus, they could also be considered snags in the labelling process.

Comparison with Related Work
First, we contrasted our results for tree segmentation with the findings of Reitberger et al. [6], which also investigated a forest site in Bavarian Forest National Park. In general, our recall for the upper layer (89% for conifers and 84% for broadleaf trees) was in the same order of magnitude. However, for the middle layer, we obtained a better recall of 43% (mean for conifers and broadleaf trees) compared to 32% reported in [6]. For the lower layer, the mean recall of 14% corresponded well with 17% shown in [6]. Our total precision of 74% was worse than the precision of 90% in [6].
Second, we compared the averaged results of the tree segmentation for the HTO data set with four studies: see Strîmbu and Strîmbu [4], Li et al. [40], Holmgren and Lindberg [41] and Silva et al. [42] (see Table 17). Note that two study areas are dominated by conifer pine trees, and that the stem densities of three studies are by a factor of 1.4 to 3.2 lower. In Table 17, the results of our study are presented with regard to (i) coniferous plots and (ii) all plots (=mixed) for the upper layer (see also Table 10 and Table 11). Li et al. [40] demonstrated in a Sierra mixed conifer pine forest a detection rate of 86% and precision of 94%. This method segmented trees using a top-to-bottom region growing approach. Strîmbu and Strîmbu [4], using a three-dimensional approach in which a weighted graph was built from hierarchical topologies, reported for a pine dominated forest a recall of 84% and precision of 90%. Silva et al. [42], applying the method to an open canopy longleaf pine forest area, segmented trees with a detection rate of 82% and precision of 85%. Recently, Holmgren and Lindberg [41] demonstrated a new watershed-based tree detection method that used a tree crown density model trained from the laser returns and some arbitrarily selected reference trees. The evaluation of a managed hemi-boreal forest showed a recall of 85% and precision of 82%. Although the stem density of our study was significantly higher in three cases, the tree segmentation detected coniferous trees slightly better. However, the precision turned out to be somewhat worse. For all the reference plots of our study, representing a mixed temperate forest, the recall was also still better or similar. Clearly, the corresponding precision was lower by 7-16% because some of the deciduous trees, which are generally more difficult to segment, could not be found in the point cloud. Third, we focused on the quality of the classification of conifers and broadleaf trees and how it compared to the results published in Yao et al. [21]. In this other study, which was also conducted in Bavarian Forest National Park, the lidar point density was 25 points/m 2 . Several feature sets were investigated, including height-and density-dependent features, as well as intensity features. Very interestingly, our OA of 94.9% (see Table 12) obtained with feature sets S g (=geometric tree features; density-and height-dependent features) and S cov (=features from aerial images) was equivalent to OA = 94% reported by Yao et al. [21]. Very likely, the radiometric content of aerial imagery could successfully substitute for the lidar intensity in our case.
Forth, we compared our classification of dead tree and snags with other approaches obtained from the same forest site. Yao et al. [21] reported a classification accuracy between 71% to 73%, which was significantly worse than ours. The study in Polewski [23] reported an accuracy of around 88%, which is slightly worse than our findings. Although both of these other studies measured the reference data manually using orthophotos superimposed on the tree crown polygons, the combination of lidar data and multispectral optical imagery in a two-class classification made the classification of dead trees possible with high accuracy. Lastly, our snag detection was better by more than 10% when compared with the results presented in [24]. Again, we concluded that the increased point density of 55 points/m 2 better reflects the 3D structure of a snag.
Finally, we concluded that all the main findings of our large-scale experiment confirmed the results of several studies that were conducted in a temperate forest.

Practical Implementation
We present some interesting statistical numbers for this large-scale mapping of forest parameters. All the methods of Section 2.4.2 and related algorithms are coded in C++ and run under Windows (© Microsoft) and Ubuntu. The tree segmentation and the tree classification are parallelized, thereby taking advantage of any multi core PC systems. In total, we used eight PC workstations in parallel, each of them equipped with up to 22 cores and 128 GByte RAM. Therefore, we split up the forest areas into 6 km × 6 km macro blocks. Each workstation processed one or two super macro blocks comprising 2 × 2 macro blocks. In total, we processed 27.321.962 single trees on a forest area 924 km 2 in size. Table 18 gives an overview of the number of tree types classified with respect to conifers, broadleaf trees, dead trees, and snags. The time critical process was the tree segmentation due to the time-consuming solving of the inherent general eigenvalue problem. Our most powerful workstation, with 22 cores, 2.2 GHz performance, and 128 GByte RAM, processed the tree segmentation and tree classification at 25 sec/ha and 0.38 sec/ha, respectively (see Table 19). By contrast, the processing times of the other two processes are significantly higher because the code is not parallelized.  We also have to address the time and effort needed for manual interactions to label additional tree samples, to train the classifiers, and to check the results of the individual processes. Interestingly, we had to add the reference areas A1, A2, A3, B1, B2, and C1 because the other reference areas from the original field measurements did not fully represent the forest characteristics and were somewhat imprecise in parts. This entire project could be finalized by one single person within 11 months. This included one month for data preprocessing (quality check of the lidar data, intensity calibration, and structuring the data into a block structure (see Section 2.2) and around three months for the manual editing part. The remaining seven months were dedicated to the processing of the remote sensing data. Note that we had to process the segmentation twice on roughly half of the entire forest because of another NCut max value useful for broadleaf trees. Moreover, the retraining of relevant classifiers instigated a second run for the classification of conifers, broadleaf trees, and dead trees as well. In addition, half the workstations were only equipped with processors using eight cores.
The tree segmentation and the classification of conifers and broadleaf trees were processed with the TreeFinder software package [43]. The detection of dead trees and snags was performed with an in-house software.
Finally, we show in Figure 8 the mapping of conifers, broadleaf trees, standing dead trees, and snags for a small part of the Bavarian Forest National Park.

Conclusions
In summary, we were able to demonstrate that tree segmentation based on normalized cut can be successfully performed with a stopping criterion specifically calibrated for tree groups 'conifers' and 'broadleaf trees', thereby significantly reducing the effect of under-and oversegmentation. In our case, the precision for tree segments of broadleaf trees could be improved by 13%. However, this concept to calibrate parameter NCut max regarding the tree groups bears the drawback that the segmentation needs to be performed manifold in the case of multiple tree species. We further showed the successful mapping of conifers, broadleaf trees, and standing dead trees (with and without crowns) on a large forest 924 km 2 in size. Conifers and broadleaf trees could be classified in both parks with an accuracy better than 90%. Furthermore, the classification of standing dead trees was also possible with a high accuracy of more than 90%. The large consistent data set can be used by both parks for future preparation of cross-border management plans and studies on biodiversity [44].
Author Contributions: P.K., C.S. and M.H. conceived and designed the experiments; A.S. and P.K. performed the experiments; all the authors analyzed the data and results; P.K. wrote the paper; all the authors reviewed and edited the paper. All authors have read and agreed to the published version of the manuscript.