This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (

Species information is a key component of any forest inventory. However, when performing forest inventory from aerial scanning Lidar data, species classification can be very difficult. We investigated changes in classification accuracy while identifying five individual tree species (Douglas-fir, western redcedar, bigleaf maple, red alder, and black cottonwood) in the Pacific Northwest United States using two data sets: discrete point Lidar data alone and discrete point data in combination with waveform Lidar data. Waveform information included variables that summarize the frequency domain representation of all waveforms crossing individual trees. Discrete point data alone provided 79.2 percent overall accuracy (kappa = 0.74) for all five species and up to 97.8 percent (kappa = 0.96) when comparing individual pairs of these five species. Incorporating waveform information improved the overall accuracy to 85.4 percent (kappa = 0.817) for five species and also improved accuracy in several two-species comparisons. Improvements were most notable in comparing the two conifer species and in comparing two of the three hardwood species.

Information about individual tree species can be extremely beneficial when estimating many forest resource values, such as timber value, habitat quality, or susceptibility to loss. Unfortunately, detection of individual tree species using remote sensing (RS) data has proven to be a difficult task to accomplish. The species of a tree is only one of several factors that affect the realized shape and color of an individual tree crown. Other factors such as terrain, environment, competition, and genetic variation have large influences as well. As a result, for most variables that one can measure from RS data, there is significant distributional overlap between species, for instance (Figure 4 in [

Knowledge of the probable species of individual crown regions identified in the data would enable us to stratify model estimates by species. This will most likely increase precision in any stand-level estimates of interest. With this goal in mind, researchers have continuously tested new forms of RS data seeking improvements in detecting stand- and individual tree-level species information. As RS technology and computer algorithms have improved, so have the classification results achieved.

With the advent of scanning Lidar, many aspects of a forest inventory can now be accomplished using the aerial form of this data [

Many authors have investigated the potential of Lidar data for species classification. Sometimes Lidar is mixed with additional data sources, such as color or multi-spectral imagery [

Crown density describes the leaf and branch size and arrangement and is typically measured using the proportions of the returns hitting vegetation versus those hitting the ground [

In the last half-decade, a newer format of Lidar information, commonly referred to as “waveform” or “fullwave” Lidar, has slowly increased in availability. In contrast to the more common discrete point Lidar systems, this newer Lidar system takes advantage of increased processor speeds and data storage capacity by digitally sampling at a high rate the return signal received at the sensor. The result mimics the appearance of a wave, and an example of such a waveform can be seen in

While a few authors have looked to waveform data for improving classification accuracy, the first step has always been to decompose the waveforms into discrete peaks. One advantage of this technique is that information about peak shape can be preserved. The shapes of these peaks have successfully been used in distinguishing vegetation from other surfaces [

In a previous paper [

Waveform data were obtained by Terrapoint, USA during the evening of 7 August 2008 over the University of Washington Arboretum in the city of Seattle, Washington. Sensor altitude above canopy surface ranged from 145 to 412 m with mean distance of 310 m. Scan angle varied from −30 to 30 degrees from zenith. Pulse frequency was 133 kHz. The majority of the arboretum was covered in one loop in the North-South direction. An area map of the Arboretum with the flight line plotted is given in

Our field data were collected somewhat differently from how one would collect them for an operational inventory. We segmented tree crowns from the Lidar data prior to visiting the field so that we could verify that each tree matched its associated data segment. Several trees of each of the five species were sampled in this manner to ensure that our data were as clean as possible. This segmentation and field data collection took place in three steps:

All waveforms were deconvolved and then decomposed into individual peaks using a simple peak-detection algorithm. This point information was indexed into a voxel array structure.

A segmentation algorithm was used on the voxel array data to map the volume of space occupied by individual tree crowns into clusters of voxels.

Outlines of these clusters were used to locate the trees on the ground and identify the species.

The light pulse emitted from the laser is not instantaneous, but rather it increases to a peak and decreases again over a given period of time. The energy reflected by an object that is received by the sensor is then spread over the same time period. Due to this spread, adjacent targets falling within the path of the laser beam are more likely to appear as one peak in the waveform. A point-spread function

If the known point spread function is an odd _{p}_{i}_{p}^{(}^{l}^{)} must be less than 0.01.

We used a point spread function that spans 9 time steps (about 9 nanoseconds in duration), derived from a binomial distribution, based on the similarity between this distribution and the Gaussian shape typically attributed to Lidar pulses. Given that
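The deconvolution step can be illustrated with a minimal sketch. It assumes a normalized 9-tap binomial point spread function and the standard multiplicative Richardson–Lucy update; the function names, the iteration count, and the synthetic two-target waveform are illustrative choices and are not taken from the study.

```python
from math import comb


def binomial_psf(n_taps=9):
    """Normalized binomial weights; 9 taps mirrors the 9 time steps in the text."""
    weights = [comb(n_taps - 1, k) for k in range(n_taps)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_same(signal, kernel):
    """Convolution trimmed to len(signal); kernel is centered (odd length assumed)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, k in enumerate(kernel):
            src = i - j + half
            if 0 <= src < len(signal):
                acc += signal[src] * k
        out.append(acc)
    return out


def richardson_lucy(observed, psf, n_iter=50, eps=1e-12):
    """Standard Richardson-Lucy iteration; n_iter and eps are assumed values."""
    mean = sum(observed) / len(observed)
    estimate = [mean] * len(observed)  # flat, strictly positive starting guess
    flipped = psf[::-1]
    for _ in range(n_iter):
        blurred = convolve_same(estimate, psf)
        ratio = [o / max(b, eps) for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, flipped)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate


# Synthetic example: two targets five samples apart, blurred by the pulse shape.
psf = binomial_psf()
truth = [0.0] * 100
truth[40] = truth[45] = 100.0
observed = convolve_same(truth, psf)
restored = richardson_lucy(observed, psf)
```

In this synthetic case the restored signal concentrates energy back toward samples 40 and 45, which is the behavior that makes closely spaced targets easier to separate before peak detection.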

A simple peak finding algorithm was performed on the waveform data after deconvolution with the Richardson–Lucy algorithm. The range at maximum for each peak found within the waveforms was used to compute an x, y, z position. Two additional pieces of information were kept for each peak. First was the total energy of the peak, or the sum of the intensity values for all waveform samples occurring during the defined peak. Second, we recorded the total range duration of the peak. The peak-finding algorithm had the following steps:

Decide on _{min}

Set _{0} =0

_{2} − _{1}. If ^{2} _{1} = _{1} =0.

Set

For

_{i}_{i−}_{1}. If ^{2} _{i}_{i}

If _{i}_{i}_{i−}_{1}

_{i}

else if _{min}

Record

else

This process resulted in a fairly dense discrete point dataset, with an average of 104 points per square meter within tree vegetation. A visual example of this data is presented in
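As a rough illustration of what a peak record contains, the sketch below uses a simplified stand-in for the peak finder: it treats every contiguous run of samples at or above a threshold as one peak. This is not the authors' step-by-step procedure, and the threshold name `i_min` and the sample spacing are assumptions. For each peak it keeps the three quantities described above: the position of the maximum, the total energy (sum of sample intensities), and the duration.

```python
def find_peaks(samples, i_min, sample_spacing=1.0):
    """Return one record per contiguous run of samples >= i_min.

    i_min and sample_spacing are hypothetical parameters for illustration.
    """
    peaks = []
    start = None
    n = len(samples)
    for i in range(n + 1):
        above = i < n and samples[i] >= i_min
        if above and start is None:
            start = i  # a new peak begins
        elif not above and start is not None:
            segment = samples[start:i]
            peaks.append({
                "index_at_max": start + segment.index(max(segment)),
                "energy": float(sum(segment)),
                "duration": (i - start) * sample_spacing,
            })
            start = None
    return peaks


example = [0, 0, 1, 5, 2, 0, 0, 3, 7, 7, 1, 0]
peaks = find_peaks(example, i_min=1.0)
# Two peaks: maxima at samples 3 and 8, energies 8.0 and 18.0.
```

The index of the maximum would then be converted to a range and, with the pulse geometry, to an x, y, z position.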

In order to obtain three-dimensional crown information about each tree crown, we created a voxel-based segmentation algorithm. Under this three-dimensional region growing algorithm, individual layers of the voxel array are read one at a time starting with the topmost layer. Individual voxels from each layer are added to new or existing voxel clusters depending on their distance from these existing clusters. The ability of a cluster to incorporate a new voxel depends on the cluster's current number of member voxels, the vertical center of mass of these member voxels, and the distance from the cluster to the new voxel.

Create an empty list of clusters.

Create an array,

Create a second array,

If any layers of interest remain, read in the next layer,

Check if cell (

Check grid

Check for neighboring clusters on grid

Compute the cluster’s radius as
_{min}_{i}_{x}_{y}_{i}_{i}_{L}_{max}_{min}_{L}_{min}

Using the radius derived with _{i}_{i}_{i}_{i}_{H}_{i}

If any of the _{i}_{min}

Go through the preliminary clusters and merge a cluster into one of its neighboring clusters if it and the neighbor meet the following conditions:

The cluster is not too big (_{i}

The distance from cluster top to cluster centroid is less than 5 m, cluster is not nearly vertical (unit direction vector from top to centroid has z-component less than 0.94), or cluster bottom is below 2 m.

The cluster is smaller than the neighbor (has fewer total voxels than the neighboring cluster).

The cluster and the neighbor have a large enough interface (at least 60 percent of the horizontal rectangle envelope containing the cluster is shared with the envelope of the neighbor).

If the cluster has more than 3 layers and is larger than 50 voxels, the direction vector passing from cluster top through cluster centroid crosses through at least 4 voxels of the neighbor (voxels must be within 2 m of the vector).

Go through the remaining clusters and delete any cluster that meets any of the following conditions:

The cluster is too small (contains fewer than 50 voxels).

The cluster is too flat (ratio of cluster height to the average of cluster width in rows and cluster width in columns is less than 0.8).

The cluster is too short (cluster top is less than 5 meters high).
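The three deletion rules above can be written as a single predicate. The sketch below uses the thresholds stated in the text (50 voxels, a height-to-width ratio of 0.8, and a 5 m top height); the function and argument names are hypothetical, and it assumes all lengths are in meters.

```python
def should_delete(n_voxels, height_m, width_rows_m, width_cols_m, top_height_m):
    """Return True if a cluster meets any of the three deletion conditions."""
    too_small = n_voxels < 50
    mean_width = (width_rows_m + width_cols_m) / 2.0
    too_flat = (height_m / mean_width) < 0.8
    too_short = top_height_m < 5.0
    return too_small or too_flat or too_short


# A tall, well-populated cluster is kept; a low, wide one is removed.
keep = should_delete(400, 20.0, 8.0, 10.0, 30.0)    # False
drop = should_delete(400, 3.0, 8.0, 10.0, 30.0)     # True (too flat)
```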

We investigated the feasibility of the alternative voxel-based segmentation algorithm proposed in [

Using the voxel clusters produced in the last step, we created a GIS layer containing the two-dimensional outlines of each crown. The crown outline data were placed on a field computer with a built-in GPS receiver. Current position in the field was used to match voxel cluster outlines to the specimens of individual trees of five native species: Douglas-fir (

Clusters that contained parts of multiple trees, as well as those that contained only part of a single tree, could be identified in the field. These clusters were split along vertical planes or combined as necessary back at the office. This work was done in March of 2011, during which hardwoods were still in a leaf-off condition. To avoid scan angles too far from nadir, we stayed within 60 m of the flight line. In doing so, we were able to identify 22 to 29 individuals of each species, totaling 130 trees. Most conifers were large enough that crown segmentation was clear. However, we had some difficulty separating crowns of hardwood species growing in close proximity. Because we wanted to be certain that each cluster represented only a single tree crown, we skipped a small number of trees whose crowns could not be reliably delineated.

Using the voxel cluster representing each of the trees identified in the field data, we were able to extract the waveforms that cross each tree from indexed waveform data. First, we found the set of voxels, _{top}_{top}_{top}

Additionally, using the voxel-array indexing, all peaks from the decomposed waveforms falling within the voxels of a cluster could be very quickly identified. We retained all information about all peaks contained within the voxel cluster of interest.

The discrete Fourier transform decomposes a time series of length _{j}

One can compute the phase shift Φ_{j}_{j}_{j}_{j}_{0}. This horizontal line acts as an intercept term in the model, and _{0} will equal the mean sample value in the original waveform.

The

Additionally, the
_{J}_{j}

The median, _{j}_{j}_{0} to _{30} and _{0} to _{30}, for each tree.
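A minimal sketch of the per-tree Fourier summaries follows. It assumes amplitudes are scaled so that the frequency-0 amplitude equals the mean sample value, as stated above, and that higher-frequency amplitudes use the usual factor of 2 for a cosine representation; that scaling, the linear-interpolation percentile, and all function names are assumptions rather than details confirmed by the text.

```python
import cmath
import math


def dft_amplitudes(waveform, n_freq=31):
    """Amplitudes for frequencies 0..n_freq-1 (requires n_freq - 1 < len/2)."""
    n = len(waveform)
    amps = []
    for k in range(n_freq):
        coef = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                   for t, x in enumerate(waveform))
        scale = 1.0 if k == 0 else 2.0  # frequency 0 gives the sample mean
        amps.append(scale * abs(coef) / n)
    return amps


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def tree_fourier_summary(waveforms, n_freq=31):
    """Median and interquartile range of each amplitude across a tree's waveforms."""
    per_wave = [dft_amplitudes(w, n_freq) for w in waveforms]
    medians, iqrs = [], []
    for k in range(n_freq):
        column = [amps[k] for amps in per_wave]
        medians.append(percentile(column, 0.5))
        iqrs.append(percentile(column, 0.75) - percentile(column, 0.25))
    return medians, iqrs


# Check on a known signal: mean 2 plus a unit cosine at frequency 5.
n = 120
signal = [2.0 + math.cos(2 * math.pi * 5 * t / n) for t in range(n)]
amps = dft_amplitudes(signal)
```

For this test signal the frequency-0 amplitude is 2 and the frequency-5 amplitude is 1, which confirms the scaling convention assumed here.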

The peaks extracted from the waveforms are equivalent to three-dimensional points, such as those in a discrete point Lidar dataset. Several variables were computed from the collective properties of these points. In the past, many such variables have been proposed, and these can be classified into one of two categories: point arrangement statistics and intensity statistics. The former will yield information about crown shape, while the latter gives information about the ability of a tree’s foliage to reflect near-infrared light.

In order to obtain information about both crown shape and reflective properties, we used both the point representation and the voxel cluster representation of each tree. Most of the variables were chosen to mimic those presented in previous studies. A few were created in the hope of obtaining information similar to that provided by the Fourier transformation of waveform data. All of these point-derived variables can be theoretically computed from a modern discrete point Lidar dataset with at least three returns per pulse as well as recorded intensity data. For each tree in the dataset we computed the following variables for use as predictors in the test classifications:
_{25}_{50}_{75}_{90}

These four variables are estimates of the 25th, 50th, 75th and 90th percentiles of the relative height (point height relative to maximum point height) distribution.

_{1}

_{2}

_{3}

These three variables are the mean intensities of the first, second, and third peaks recorded for each pulse.

_{12}

_{13}

_{23}

These three variables are the mean Euclidean distances from first to second, first to third, and second to third returns, respectively, across all pulses.

We collected all distances between two consecutive peaks, regardless of position in the waveform. This variable is the estimated rate parameter of an exponential distribution, with form ^{−λx}
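The rate parameter can be sketched as follows. The text does not state which estimator was used, so this example assumes the maximum likelihood estimate for an exponential distribution, which is the reciprocal of the mean distance; the function names and the toy peak ranges are hypothetical.

```python
def consecutive_distances(pulses):
    """Distances between consecutive peaks within each pulse.

    Each pulse is a list of peak ranges (m) ordered along the beam.
    """
    distances = []
    for ranges in pulses:
        for first, second in zip(ranges, ranges[1:]):
            distances.append(second - first)
    return distances


def exponential_rate(distances):
    """Maximum likelihood estimate of lambda: n / sum(x) = 1 / mean(x)."""
    return len(distances) / sum(distances)


pulses = [[10.0, 11.0, 13.0], [20.0, 20.5], [7.0]]
gaps = consecutive_distances(pulses)   # [1.0, 2.0, 0.5]
rate = exponential_rate(gaps)          # 3 / 3.5
```

Pulses with a single peak contribute no distances, so they do not affect the estimate.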

_{top}

For each tree, we reserved the set of member voxels which contain the highest recorded peaks in the vertical region projected above their respective row and column. This set,

_{area}

We also reserved the set,

_{n}

_{1}

_{nt,}

_{nb}

Using only the layer, row, and column address of member voxels of a tree, one can easily compute which neighbors a given voxel has. These three variables are the proportions of voxel members of the voxel cluster containing only one neighbor voxel, only a bottom neighbor, and only a top neighbor, respectively. Re-indexing the points to a 0.5 m
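The neighbor proportions can be computed directly from voxel addresses, as the following sketch shows. It assumes six-connected (face-sharing) neighbors and a layer index that increases upward, and it reads "only a top neighbor" as a voxel whose single neighbor lies directly above; these interpretations and the function name are assumptions.

```python
# Offsets are (layer, row, column); layer + 1 is taken to be the voxel above.
NEIGHBOR_OFFSETS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def neighbor_proportions(voxels):
    """Return proportions of voxels with exactly one neighbor,
    only a top neighbor, and only a bottom neighbor."""
    members = set(voxels)
    one = top_only = bottom_only = 0
    for layer, row, col in members:
        present = [
            off for off in NEIGHBOR_OFFSETS
            if (layer + off[0], row + off[1], col + off[2]) in members
        ]
        if len(present) == 1:
            one += 1
            if present[0] == (1, 0, 0):
                top_only += 1
            elif present[0] == (-1, 0, 0):
                bottom_only += 1
    n = len(members)
    return one / n, top_only / n, bottom_only / n


# A vertical column of three voxels: the two end voxels each have one neighbor.
column = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
p_one, p_top, p_bottom = neighbor_proportions(column)
```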

_{a}

_{b}

A function, given in

Many of the above variables describe crown shape and should be almost entirely unrelated to the information available from the Fourier transforms of the original waveforms. These include the height percentiles (_{25}_{50}_{75}_{90}), the voxel neighbor statistics (_{n}_{1}_{nt,}_{nb}_{a}_{b}_{area}

Conversely, the remaining variables not mentioned above may be correlated with the Fourier transform variables. The Fourier transform statistics indirectly provide quantifications of both the propensity of samples at different distances apart to be part of peaks and the scale differences of these peaks. If variables exist that can alternatively describe these traits, they might act as surrogates for the information available in the Fourier transform variables.

The remainder of the listed point-based variables were intended to provide this surrogate information. A high value of _{top}_{1}_{2}_{3}) and the distance statistics (_{12}_{13}_{23} and

The Fourier variables computed from the waveforms have very high dimension. Because of the high correlation among the amplitudes for all frequencies, it is very likely that the majority of this information could be described in fewer dimensions. Therefore, we ran separate principal component analyses on both the median and the IQR statistics for frequencies 1 through 30. This was done with the
_{m}_{1} to _{m}_{5}) as well as the first five components of the IQRs (_{q}_{1} to _{q}_{5}) were then used as predictor variables in the classifications. As mentioned previously, the amplitude for frequency 0 on an individual waveform is the mean waveform sample intensity. The median and the IQR of this amplitude value were believed to be especially beneficial, and were therefore kept as predictors and not included in the principal component reductions. This procedure reduced the dimensionality of the Fourier transform variables from 62 to 12.

We used the R function
_{m}_{1}_{m}_{5}, _{q}_{1}_{q}_{5}, _{0}, and _{0}) and the second set, _{1}_{2}_{3}, _{12}_{13}_{23},_{top}_{i}_{i}_{i}_{i}_{i}_{i}

We were interested not only in the overall performance of the variable groups, but also in which species comparisons each individual variable group performed best. After comparing the predictions from several routines, including linear discriminant analysis, classification trees, and a neural-network approach, we found that support vector machine (SVM) classification performed best overall. SVM is typically used as a kernel-based algorithm, in which a linear algorithm is applied to a non-linear mapping

We used the

We tested several predictor groups, individually and in combination, for the classifications. These groupings were: (_{25}_{50}_{75}_{90}; (_{1}_{2}_{3}; (_{12}_{13}_{23}_{top}_{area}_{n}_{1}_{nt}_{nb}_{a}_{b}_{m}_{1}_{m}_{5}; (_{q}_{1}_{q}_{5}; (_{0} and _{0}; (

We also applied the SVM on seven different classifications for each predictor group. This was to examine the species differences that were most sensitive to each predictor group. This information can help to better understand how the Fourier information either does or does not improve classification. The seven classifications were: (1) all species; (2) only hardwood species; (3) all species remapped to either CO or HW; (4) BC and BM only; (5) BC and RA only; (6) BM and RA only; and (7) DF and RC only.

For each classification, five-fold cross validation was used to test the performance of each predictor group. The trees in each species (or each growth form) were randomly split into five groups of similar size, and these species groups were combined into five data groupings. The species predictions for each grouping were performed by a decision rule based on the other four groupings combined. In this way no trees fall in a training set and validation set at the same time. Overall accuracy for each predictor group and classification was computed as the number of correctly predicted trees divided by the total number of trees.
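The fold construction described above can be sketched as a stratified split: trees of each species are shuffled and dealt into five groups, and the per-species groups are pooled into five folds. The seed, the dealing order, and the function name are illustrative assumptions; the species counts are those reported for the training data.

```python
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into n_folds, balancing each class across folds."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % n_folds].append(index)
    return folds


# Species counts from the field data: 130 trees in total.
labels = ["BC"] * 24 + ["BM"] * 22 + ["DF"] * 29 + ["RA"] * 28 + ["RC"] * 27
folds = stratified_folds(labels)

# Each fold serves once as the validation set; the other four form the training set.
splits = [
    ([i for k, fold in enumerate(folds) if k != held_out for i in fold], folds[held_out])
    for held_out in range(len(folds))
]
```

Because every index is dealt to exactly one fold, no tree appears in both the training and validation sets of the same split.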

For the classification of all five species, we performed the exact test by Liddell [

The first five principal components of the Fourier median variables (_{1} to _{30}) had standard deviations of 5.11, 1.65, 0.68, 0.51, and 0.31 respectively. The loadings of the first three of these components are displayed in graphical form in

the mean of the amplitudes over all frequencies

a comparison of the lower half of the frequencies against the higher half

the middle frequencies against the combined low and high frequencies.

Component 1 is likely a measure of the total pulse energy reflected by the target at the sensor. In general, the lower frequencies in the transformed waveforms have amplitudes several orders of magnitude larger than the amplitudes of the higher frequencies. After centering and scaling the data for each frequency, which removes this imbalance, the influence of each frequency becomes more equal. Components 2 and 3 measure the influence of the different groups of frequencies relative to the other groups.

For the IQRs, the standard deviations were 4.54, 2.72, 0.82, 0.67, and 0.40 respectively. _{1} to _{30}). A similar pattern to what appears in

The first two pairs of canonical variates were highly correlated, with correlations of 0.98 and 0.90 for (_{1}_{1}) and (_{2}_{2}) respectively. The following six pairs had correlations of 0.50 or lower. The high amount of correlation between the first two pairs demonstrated that there is some overlap of information among the two datasets. In other words, a portion of information from the Fourier transforms of the waveforms can be obtained from patterns from the discrete points extracted from the waveforms.

Coefficients from the first two rotations are shown in _{1}, _{2}, _{1}, and _{2}. Given the coefficients and means given in the table, values of _{1} are most influenced by the variables _{m}_{1}, _{0} and _{0}. These variables are all related to the amount of energy received by the sensor in the Lidar instrument. Similarly, the intensity means, _{1} to _{3}, show strong influence in both _{1} and _{2}. This result is nearly as visible by just looking at the correlation between some of these variables alone. In fact, _{1} shares a correlation of 0.85 with _{m}_{1} when the two variables are compared directly. Surface point density, _{top}_{1} and _{2}, suggesting that this information may not be obtainable from the waveform Fourier transformations directly. On the other hand, _{12}, _{13,} and _{23} do play a significant part in _{1} and _{2}, suggesting that some of this information overlaps between the two variable sets.

The combined variables from the discrete point and waveform datasets worked well for the classification of the five species. An overall accuracy of over 85 percent was achieved, with 111 out of 130 trees correctly classified.
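The overall accuracy and kappa statistic can be reproduced from the confusion matrix reported for the five-species classification. The sketch below assumes that rows are field-identified species and columns are predicted species, and uses the standard Cohen's kappa formula; the function names are illustrative.

```python
SPECIES = ["BC", "BM", "DF", "RA", "RC"]

# Rows: field-identified species (assumed); columns: predicted species.
CONFUSION = [
    [22, 0, 0, 1, 1],
    [1, 19, 0, 1, 1],
    [1, 1, 26, 1, 0],
    [1, 0, 2, 22, 3],
    [1, 0, 2, 2, 22],
]


def overall_accuracy(matrix):
    """Correctly classified trees divided by all trees."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement beyond what row and column totals would produce by chance."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    observed = overall_accuracy(matrix)
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (observed - expected) / (1.0 - expected)


def producer_accuracy(matrix):
    """Per-species accuracy relative to row totals, in percent."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


accuracy_pct = round(100.0 * overall_accuracy(CONFUSION), 1)   # 85.4
kappa = round(cohen_kappa(CONFUSION), 3)                        # 0.817
```

The diagonal sums to 111 of 130 trees, and the chance agreement computed from the marginal totals is about 0.202, which yields the kappa of 0.817.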

Of the point-derived variables, no single group seemed to perform best in all situations. In fact, each individual group seemed to have species for which it was highly important. The relative height percentiles in group

Conversely, the Fourier transformation variable groups had a lot less variation in performance across the species groups. The median variables in group

For five species the overall accuracy of just over 85 percent achieved by the combination of point-derived and Fourier transformation variables compared favorably to similar research on several species [

Because of the large number of predictor variables, dimensionality was a possible reason for concern. The SVM function is a good choice for this study because it is able to handle a large number of dimensions in the predictor set. However, one negative aspect of the SVM function is that it can require some fine-tuning to reach maximal performance for a given data set. Performing such a customization for several predictor groups and several species comparisons would have both introduced too much bias and taken far too much time. This lack of tuning likely explains some unexpected predictive behavior. One example of such behavior is the reduction in accuracy from 96.4 to 91.1 percent comparing the classifications of the two conifers using variables in group

We reduced the dimensionality of the two sets of Fourier transformation variables, medians and IQR, to six variables each using principal component analysis. Despite the heavy reduction, the 72 percent accuracy achieved by all twelve of the Fourier transformation variables in the classification of the three hardwoods nearly matched the 75 percent accuracy reported previously in Vaughn

The strongest individual predictor group for the five species classification was group

The group containing all waveform information,

The point-derived variable group,

The point height distribution statistics in group

In _{12}, _{13}, and _{23}, are smaller for BM than for all other species. As the name suggests, bigleaf maple has very large leaves that may be larger than even the pulse footprint. Each leaf hit is then very likely to record a noticeable peak in the return signal. This might result in more detectable peaks close to the crown surface. The crown surface model parameters in group _{a}

The voxel-based neighbor statistics in group _{n}_{1} for these two species. Perhaps larger cottonwoods and maples are more prone to lone branches that would lead to a larger number of one-neighbor cells. Including height as a predictor might account for this difference. Ørka

Few authors have investigated Lidar-derived crown texture directly as a predictor variable. Vauhkonen

Our intentional attempt to create variables from the discrete point data that aliased the information available in the Fourier transformations was not successful. These variables, such as those in groups

The trees measured in this study, most notably the conifers, are mostly open grown. The results as they stand would not directly translate to a high density commercial forest. With the increased density a smaller portion of each tree’s crown would be uniquely identifiable. Because the waveform information used in this study only contains one-dimensional information about position from wave start, the density should not greatly affect the results reported here. However, many of the spatial variables we used, such as those in groups

The prevalence of aerial Lidar for forest inventory information has increased over the last decade, and discrete point Lidar data has already shown much promise in distinguishing individual tree species [

For classification with discrete point data only, we introduced several new variables. In the five species classification, two groups of these new variables performed second and third best of all the groups. In each of the smaller classifications of two or three species or species groups, one of the groups of new variables performed at least second best. Note that the computation of each of these variables was very simple using a voxel representation of each tree. To provide such a representation, we introduced a new segmentation algorithm that can be easily adapted to local crown properties.

Perhaps of most interest, we discovered that summary information derived from entire waveforms provided predictive power above and beyond that of the discrete point data alone. This addition raised the overall accuracy to 85.4 percent (kappa = 0.817). The waveform information was important for separating Douglas-fir from western redcedar, increasing accuracy 3.5 percent (kappa increase = 0.072), and bigleaf maple from red alder, increasing accuracy 4.0 percent (kappa increase = 0.082). For the other two-species comparisons, waveform information provided no gain in accuracy.

Other researchers have found that decomposing airborne waveform Lidar data into discrete points is useful for classification purposes [

The authors sincerely thank Terrapoint, USA for the provision of waveform Lidar data that is otherwise difficult to obtain. Thanks to the Precision Forestry Cooperative at the University of Washington, and to the Corkery Family Trust for funding portions of this work. David Briggs and Jim Flewelling provided very attentive and helpful feedback on earlier drafts of this paper. Their help is much appreciated.

An example waveform and probable associated discrete return points. The 120 waveform samples are shown as circles and a spline fit to these data appears as a solid line. A peak detector might detect two peaks at about 338 and 347 meters and return the intensity value when the peak is detected as shown with exes. Without knowledge of future sample values, real time peak detection algorithms usually produce a slight lag in peak location.

A map of the University of Washington Arboretum in the city of Seattle. The Arboretum boundary is shown as a red line. The helicopter flight path is plotted as a blue line and the associated Lidar coverage area is blue-tinted.

Three-dimensional plot of discrete point Lidar data extracted from waveform Lidar data. This view is looking northwest at the area outlined with a tan colored dotted line in

The combination of all component waves from the discrete Fourier transform, given in

_{area}_{area}_{a}_{b}

Loadings of the first three principal components of the Fourier median variables (_{1} to _{30}).

Loadings of the first three principal components of the Fourier interquartile range variables (_{1} to _{30}).

Default values for user-defined parameters of the crown segmentation algorithm.

Parameter | Value |
---|---|
_{min} | 2.00 |
_{max} | 1.50 |
_{min} | 1.00 |
 | 4.6 |
_{L} | 3.00 |
_{O} | 7.00 |
 | 3.00 |
 | 8.00 |
_{min} | 1.00 |

Tree height statistics by species of the trees contained in the training data.

Species | Count | Min. (m) | 25th Pct. (m) | Median (m) | 75th Pct. (m) | Max. (m) |
---|---|---|---|---|---|---|
BC | 24 | 30.52 | 36.16 | 37.67 | 38.88 | 42.76 |
BM | 22 | 25.78 | 26.90 | 28.54 | 31.34 | 35.84 |
DF | 29 | 25.03 | 31.21 | 35.35 | 37.70 | 40.89 |
RA | 28 | 16.23 | 22.31 | 25.23 | 29.30 | 35.53 |
RC | 27 | 24.67 | 29.30 | 30.66 | 32.92 | 38.87 |

Coefficients from the canonical correlation procedure for the first two canonical variates of both datasets. Mean and standard deviation of all variables are included for reference.

Var. | Mean | S.D. | Coefficient _{1} | Coefficient _{2} | Var. | Mean | S.D. | Coefficient _{1} | Coefficient _{2} |
---|---|---|---|---|---|---|---|---|---|
_{m}_{1} | 0.0 | 5.11 | 0.013 | −0.035 | _{1} | 95.8 | 17.15 | 0.005 | −0.002 |
_{m}_{2} | 0.0 | 1.65 | −0.008 | 0.013 | _{2} | 34.5 | 6.38 | 0.005 | 0.002 |
_{m}_{3} | 0.0 | 0.68 | 0.000 | −0.017 | _{3} | 14.5 | 4.00 | 0.004 | 0.017 |
_{m}_{4} | 0.0 | 0.51 | 0.001 | −0.090 | _{12} | 1.4 | 0.08 | −0.146 | 0.169 |
_{m}_{5} | 0.0 | 0.31 | −0.014 | 0.046 | _{13} | 2.6 | 0.15 | 0.151 | −0.309 |
_{q}_{1} | 0.0 | 4.54 | −0.007 | 0.009 | _{23} | 1.3 | 0.09 | −0.167 | −0.021 |
_{q}_{2} | 0.0 | 2.72 | −0.001 | 0.009 | | 0.3 | 0.03 | −0.119 | 0.039 |
_{q}_{3} | 0.0 | 0.82 | 0.001 | −0.000 | _{top} | 0.2 | 0.07 | −0.049 | −0.044 |
_{q}_{4} | 0.0 | 0.62 | 0.000 | −0.005 | | | | | |
_{q}_{5} | 0.0 | 0.40 | 0.014 | −0.040 | | | | | |
_{0} | 11.6 | 1.60 | 0.036 | 0.100 | | | | | |
_{0} | 3.0 | 0.64 | 0.042 | −0.036 | | | | | |

Confusion matrix for the classification of all five species using all available predictor variables from both the discrete point and waveform data.

Species | Predicted BC | Predicted BM | Predicted DF | Predicted RA | Predicted RC | Producer Accuracy |
---|---|---|---|---|---|---|
BC | 22 | 0 | 0 | 1 | 1 | 91.7 |
BM | 1 | 19 | 0 | 1 | 1 | 86.4 |
DF | 1 | 1 | 26 | 1 | 0 | 89.7 |
RA | 1 | 0 | 2 | 22 | 3 | 78.6 |
RC | 1 | 0 | 2 | 2 | 22 | 81.5 |
User Accuracy | 84.6 | 95.0 | 86.7 | 81.5 | 81.5 | |

Overall accuracy,

Overall percent classification accuracy results of the support vector machine applied with a five-fold cross validation to different predictor variable groups and species groups.

Pred. Group | All (%) | HW (%) | CO/HW (%) | BC/BM (%) | BC/RA (%) | BM/RA (%) | DF/RC (%) |
---|---|---|---|---|---|---|---|
_{25}_{50}_{75}_{90} | 33.1 | 54.1 | 59.4 | 63.0 | 80.8 | 72.0 | 73.2 |
_{1}_{2}_{3} | 53.1 | 66.2 | 65.4 | 80.4 | 65.4 | 70.0 | 96.4 |
_{12}_{13}_{23} | 40.0 | 60.8 | 75.9 | 84.8 | 65.4 | 84.0 | 51.8 |
_{top}_{area} | 51.5 | 63.5 | 65.4 | 80.4 | 63.5 | 76.0 | 83.9 |
_{n}_{1}_{nt}_{nb} | 46.9 | 52.7 | 78.9 | 58.7 | 75.0 | 68.0 | 71.4 |
_{a}_{b} | 38.5 | 55.4 | 77.4 | 87.0 | 50.0 | 70.0 | 57.1 |
All point-derived variables | 79.2 | 87.8 | 85.0 | 97.8 | 94.2 | 88.0 | 91.1 |
_{m}_{1}_{m}_{5} | 57.7 | 59.5 | 67.7 | 69.6 | 78.8 | 82.0 | 89.3 |
_{q}_{1}_{q}_{5} | 49.2 | 50.0 | 67.7 | 73.9 | 63.5 | 72.0 | 91.1 |
_{0}_{0} | 46.9 | 52.7 | 75.2 | 78.3 | 69.2 | 64.0 | 80.4 |
All Fourier variables | 66.2 | 71.6 | 75.9 | 82.6 | 88.5 | 84.0 | 92.9 |
All variables combined | 85.4 | 90.5 | 86.5 | 97.8 | 94.2 | 92.0 | 94.6 |

Kappa statistics of the support vector machine applied with a five-fold cross validation to different predictor variable groups and species groups.

Pred. Group | All | HW | CO/HW | BC/BM | BC/RA | BM/RA | DF/RC |
---|---|---|---|---|---|---|---|
_{25}_{50}_{75}_{90} | 0.158 | 0.298 | 0.163 | 0.244 | 0.615 | 0.426 | 0.464 |
_{1}_{2}_{3} | 0.411 | 0.493 | 0.309 | 0.604 | 0.312 | 0.400 | 0.928 |
_{12}_{13}_{23} | 0.245 | 0.405 | 0.514 | 0.692 | 0.295 | 0.678 | 0.031 |
_{top}_{area} | 0.391 | 0.444 | 0.287 | 0.604 | 0.254 | 0.508 | 0.679 |
_{n}_{1}_{nt}_{nb} | 0.332 | 0.279 | 0.576 | 0.183 | 0.499 | 0.324 | 0.434 |
_{a}_{b} | 0.226 | 0.328 | 0.538 | 0.736 | 0.000 | 0.370 | 0.133 |
All point-derived variables | 0.740 | 0.817 | 0.700 | 0.956 | 0.884 | 0.756 | 0.821 |
_{m}_{1}_{m}_{5} | 0.468 | 0.388 | 0.325 | 0.385 | 0.578 | 0.629 | 0.785 |
_{q}_{1}_{q}_{5} | 0.362 | 0.246 | 0.335 | 0.469 | 0.271 | 0.421 | 0.821 |
_{0}_{0} | 0.331 | 0.284 | 0.493 | 0.561 | 0.377 | 0.255 | 0.608 |
All Fourier variables | 0.575 | 0.570 | 0.508 | 0.650 | 0.766 | 0.672 | 0.857 |
All variables combined | 0.817 | 0.857 | 0.727 | 0.956 | 0.884 | 0.838 | 0.893 |