Next Article in Journal
Distribution of Artisanal and Small-Scale Gold Mining in the Tapajós River Basin (Brazilian Amazon) over the Past 40 Years and Relationship with Water Siltation
Next Article in Special Issue
A Sparsity-Based Regularization Approach for Deconvolution of Full-Waveform Airborne Lidar Data
Previous Article in Journal
Diagnosing Horizontal and Inter-Channel Observation Error Correlations for SEVIRI Observations Using Observation-Minus-Background and Observation-Minus-Analysis Statistics
Previous Article in Special Issue
An Easy-to-Use Airborne LiDAR Data Filtering Method Based on Cloth Simulation
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Extracting Canopy Surface Texture from Airborne Laser Scanning Data for the Supervised and Unsupervised Prediction of Area-Based Forest Characteristics

1
Department of Forest Sciences, University of Helsinki, P.O. Box 27, FI-00014 Helsinki, Finland
2
School of Forest Sciences, University of Eastern Finland, P.O. Box 111, FI-80101 Joensuu, Finland
*
Author to whom correspondence should be addressed.
Remote Sens. 2016, 8(7), 582; https://doi.org/10.3390/rs8070582
Submission received: 31 May 2016 / Revised: 1 July 2016 / Accepted: 5 July 2016 / Published: 9 July 2016
(This article belongs to the Special Issue Airborne Laser Scanning)

Abstract

:
Area-based analyses of airborne laser scanning (ALS) data are an established approach to obtain wall-to-wall predictions of forest characteristics for vast areas. The analyses of sparse data in particular are based on the height value distributions, which do not produce optimal information on the horizontal forest structure. We evaluated the complementary potential of features quantifying the textural variation of ALS-based canopy height models (CHMs) for both supervised (linear regression) and unsupervised (k-Means clustering) analyses. Based on a comprehensive literature review, we identified a total of four texture analysis methods that produced rotation-invariant features of different order and scale. The CHMs and the textural features were derived from practical sparse-density, leaf-off ALS data originally acquired for ground elevation modeling. The features were extracted from a circular window of 254 m2 and related with boreal forest characteristics observed from altogether 155 field sample plots. Features based on gray-level histograms, distribution of forest patches, and gray-level co-occurrence matrices were related with plot volume, basal area, and mean diameter with coefficients of determination (R2) of up to 0.63–0.70, whereas features that measured the uniformity of local binary patterns of the CHMs performed poorer. Overall, the textural features compared favorably with benchmark features based on the point data, indicating that the textural features contain additional information useful for the prediction of forest characteristics. Due to the developed processing routines for raster data, the CHM features may potentially be extracted with a lower computational burden, which promotes their use for applications such as pre-stratification or guiding the field plot sampling based solely on ALS data.

Graphical Abstract

1. Introduction

Airborne laser scanning (ALS; also referred to in some instances as “airborne scanning LiDAR”) has become a routinely operated technique for wall-to-wall prediction and mapping of various forest characteristics [1]. Due to straightforward implementation with ALS data acquired broadly for ground elevation modeling [2,3,4], so-called area-based approaches are the most established prediction techniques. By area-based approaches, we fundamentally refer to the two-stage procedure [5] in which: (i) models to predict the forest attributes of interest for the individual areas of interest (AOIs) are fit based on a set of training field plots; and (ii) the resulting models are applied to all the AOIs of the entire inventory area to produce wall-to-wall predictions. Operational implementations are elaborated upon in [6,7,8].
To develop predictive relationships between ALS and field training data, relevant features need to be extracted. The sparse pulse densities (typically < 1 m−2 according to [6,7]) allow extracting features related to the distributions of height values or proportions of certain types of echoes [5,9], which do not account for the full horizontal information available in the data. It is possible to extract structural and volumetric features from sparse, area-based data [10,11,12], whereas applying notably higher pulse densities has allowed understory- [13,14,15] or tree-level descriptions based on canopy density and height models [16,17,18]. Canopy height models (CHMs) are rasterized images representing interpolated difference in elevations between the top of the vegetation and ground level. Importantly, the sparse pulse density of ALS data constrains the CHM pixel size and therefore affects the accuracies of the subsequent forest attribute predictions. However, some researchers have related the CHM texture derived even from sparse ALS data to the properties of the growing stock such as the spatial pattern of the trees [19,20].
Texture analysis is a well-established field of image analysis with early forest applications for optical data acquired by varying sensors [21,22,23,24,25,26,27]. Additionally, Tuominen and Pekkarinen [28] provide a comprehensive overview of parameters affecting the prediction accuracies of most essential forest attributes based on very high-resolution image data in forest conditions closely corresponding to those of the present study. Textural features derived from the aerial images were used as predictor variables with ALS point-based features in [29,30]. Even though ALS-based CHMs were readily suitable for texture analyses, those did not appear until about a decade after the introduction of the ALS-based CHM products [31]. CHMs derived from high-density ALS data were analyzed for their texture in [32,33,34], and those were derived using practical sparse pulse densities in [19,20,35]. The applications are, to date, related to predicting species, diversity, and the spatial arrangement of the trees, whereas the relationships between the textural features derived from the CHMs and forest attributes are not comprehensively known like in the case of aerial images [28].
In all ALS studies cited in the previous paragraph, the characterization of the image texture was based on computing a gray-level co-occurrence matrix (GLCM) and a set of descriptive features following the principles presented by Haralick et al., in 1973 [36]. However, there are several alternative approaches to quantify image texture. The use of Gabor or wavelet filtering produced good results compared to using the GLCM features with other remotely sensed materials besides ALS data [37]. Autocorrelation functions or (semi-)variograms [38] were tested with optical images [39,40,41] and with ALS data [42]. Patch or landscape features [43] were proposed as coarse measures of CHM texture [35]. More detailed approaches, such as analyzing local microstructures [44] or geometrically [45,46,47] or topologically [48] driven partitioning of image patches, have been proposed as powerful texture discriminators. To the best of our knowledge, however, the latter approaches are tested for built material classification only. Thus, the suitability of these methods and sensitivity of their parameters for remotely sensed vegetation analyses cannot be deduced based on earlier studies.
In summary, various texture analyses can be identified in the literature to improve the assessments of forest growing stock attributes, but analyses considering aspects other than GLCM-based textural features, especially in the case of ALS-based CHMs, are lacking. Even though conventional point-based ALS features most likely produce the best prediction outcome due to their potential to depict multilayered vertical strata [49,50], the textural features may be expected to improve the predictions by allowing a more detailed description of the horizontal forest structure [19,20]. Beside the use of textural features in supervised learning approaches, the favorable computational properties of the CHMs [51] may open up possibilities for inventories, in which ALS data have been acquired but no field reference data yet exist [52]. In those cases, the texture features may be used for pre-stratification or guiding a later field inventory to cover the variation of forest attributes within the area, which is challenging but important [53,54,55]. Thus, testing textural features for both the supervised and unsupervised learning approaches is well reasoned.
The purpose of this study is thus to compare texture analysis methods on CHMs derived from practically available sparse, leaf-off ALS data. Appropriate methods for extracting the textural features were identified based on a comprehensive literature review. The features were related to selected forest biophysical properties measured from field reference plots with an area of 254 m2. The performance of these features was tested in: (i) a supervised prediction of the total stem volume, basal area, and mean diameter; and (ii) an unsupervised classification of the forest area for a simulated sampling of the field plots. For benchmarking purposes, the extracted features were compared to the most common point-based ALS predictor variables in the supervised predictions.

2. Materials and Methods

2.1. Experimental Data

2.1.1. Study Area

The study area is located in the southern boreal forest zone in Evo, Finland (61.19°N, 25.11°E). The area of altogether approximately 2000 ha is a part of a state-owned forest. The elevation is typically 125–145 m above sea level and mineral soils with gentle slopes prevail. The forest stands vary from natural to intensively managed in terms of their silvicultural status. The area is dominated by Scots pine (Pinus sylvestris L.) and Norway spruce (Picea abies [L.] H. Karst.), which altogether constitute approximately 84% of the growing tree stock, with minor proportions of deciduous trees mainly occurring below the dominant canopy.

2.1.2. ALS Data

The ALS data used in the study were acquired by the National Land Survey of Finland as a part of their data-acquisition campaign for creating a nationwide ground elevation model for Finland. The data were downloaded from a file service [56], from which they are available for free and with extensive permissions of use. The data were acquired in two separate campaigns: on 7 May 2012 with Optech ALTM Gemini scanner operated from a flying altitude of 1830 m, and on 13 May 2012 with Leica ALS50 scanner operated from 2200 m. In both the acquisitions, a scanning angle of ±20°, a ground footprint of approximately 50 cm, and otherwise similar scanning parameters were applied to yield a nominal pulse density of 0.7–0.8 m−2. Both the scanner systems recorded up to four echoes per each emitted pulse. The data provider had detected and classified the ground level, on which the normalization of the vegetation height values was based, using an adaptive filtering algorithm [57] implemented in TerraScan software (Terrasolid Ltd., Helsinki, Finland). As the data are meant specifically for ground elevation modeling, we assumed the accuracy of this classification to be appropriate for our purposes. Following similar data acquisition and processing principles, the standard errors of terrain elevation values were found to be in order of 15 cm [58].

2.1.3. Field Measurements

The field sample was selected based on auxiliary information from the ALS data [59]. Altogether, five ALS features were extracted to quantify height and canopy cover of the dominant vegetation, and density, diversity, and proportion of understory trees and shrub layer, determined as below 60% of the maximum height and below 5 m [59], respectively. These data were clustered, applying an unsupervised k-Means algorithm with equal weights for each feature to obtain an ALS-based stratification of forest structure. The strategy was found efficient to distribute the sample across the spatial, size, and age distributions of the tree stock.
A field measurement protocol based on circular plots with the radius of 9 m was applied on a total of 155 plots. The species and diameter-at-breast height (DBH) were measured for each tree with a DBH ≥ 5 cm. For each tree species of the plot, a tree with a DBH corresponding to the median tree was determined in the field and measured for height. These attributes were used as the median tree attributes per plot and species. The similarity of close surroundings of each circular reference plot was quantified by establishing an additional field plot 16 m from the original plot center in each cardinal direction (0°, 90°, 180°, and 270°). In these satellite plots, only DBHs were measured from tally trees selected employing a basal area factor of 2 m2a−1.
The positions of the field plot centers were measured with a Trimble® Nomad® 900G Global Positioning System and Global Navigation Satellite System receiver with an external antenna and battery. At least 100 signals were collected from the center of each plot, and its position was computed from these observations. The receiver was connected to a GSM phone, which provided a connection to a reference station for real-time differential correction to help the positioning in the field. A differential post-processing was finally applied using Trimble® Pathfinder® Office and a local, permanent base station.

2.1.4. Field Plot Attributes Derived for the Evaluation

Plot-level forest attributes were compiled and aggregated from the tree-level measurements. Plot basal area (G) was computed by summing the DBH measurements. The variations in the within-plot DBH distribution and G among the main plot and the satellite plots were quantified using the coefficient of variation (CV), i.e., the ratio of the standard deviation to the mean of the specific values. The missing tree heights were predicted by calibrating the parameters of Näslund’s height curve [60] using the species-specific median tree diameter and height estimates [61]. The basal-area-weighted DBH (DgM) and height (HgM) were computed based on all the trees of a plot. The volumes of the individual trees were predicted, employing the DBH and height as predictors of species-specific equations [62]. The volume models for birch were used for all deciduous trees. The volumes (V) were summed to the plot level and scaled per ha. Descriptive statistics of the 155 reference plots measured are presented in Table 1.

2.2. Canopy Height Model (CHM) Generation

CHM rasters were generated based on the first-of-many and only echoes of the normalized ALS data, aiming to obtain the main information from the data while retaining most generalization abilities over sensors that record a different number of echo categories [6]. The low pulse density of ALS data was observed to restrict the realism of the CHM, as a requirement for a smaller pixel size would have meant that fewer CHM values were observed and more were interpolated. Consequently, different pixel sizes were tested and an inverse-distance weighting (IDW) technique [63] was applied to fill the pixels without observations. The IDW algorithm was selected due to its simplicity and controllability based on only few parameters (see further notes in Section 4). The proportions of the observed CHM values with the considered pixel sizes are shown on Table 2.
Using the six different pixel sizes (Table 2), the CHM values for all the pixels containing at least one ALS echo were initially filled by selecting the maximum height value of observations inside the pixel. A non-filled value at pixel S0 was interpolated using the observed CHM values z at pixels Si (Equation (1)), where λi is the weight calculated according to Equation (2) based on an interpolation parameter α and d0i as the distance between S0 and Si:
z ^ ( S 0 ) = i = 1 n λ i z ( S i )
λ i = d 0 i α / i = 1 n d 0 i α
The maximum distance between S0 and Si was set to 3 m, i.e., observations located further than 3 m from the center of an empty pixel had no effect to the interpolation of the specific CHM value. Values 0.1, 0.5, 1, 2, 3, …, 9, and 10 were tested for α. Figure 1 shows examples of CHMs derived using different pixel sizes and α = 5, which was used in most of the analyses.

2.3. Textural Feature Extraction from the CHM

Image texture analysis has been a subject of very extensive research since the 1970s—for a review, see, e.g., Chapter 2 in [64]. This makes an exhaustive comparison of all the published methods impossible. The properties of the algorithms therefore were analyzed by relating earlier literature to our research objectives, i.e., to model forest attributes aggregated from field plots with a size of 254 m2 based on the extracted textural features. Several aspects such as invariance and robustness of the extraction method relative to the properties of the analyzed images were considered when choosing the algorithms [65].
Texture analyses are typically labeled as: (i) statistical; (ii) geometrical; (iii) model-based; or (iv) signal processing [65,66], in which the discrimination between different textures is based on: (i) comparing the statistics computed over different (sub-)regions; (ii–iii) determining geometric or image-model-based primitives that compose the texture; or (iv) filtering the image in the spatial or frequency domain to be able to compute the frequency components of the image signal. Approaches (ii) and (iii) were deemed inappropriate for our purposes due to the requirement to assume a geometrical or image model behind the process that generated the texture, i.e., to assume the spatial and size variation in the trees of the plots. Approach (iv) was found to be sensitive to a particular frequency and orientation, and even if the invariance to these factors was achieved by a multichannel filtering approach [64], the method appeared as overly complicated for our purposes. Hence, all the methods included in the present comparison could be labeled as statistical texture analyses. The number of these methods was further reduced by a requirement to obtain rotation-invariant features, i.e., not having to assume that the surfaces analyzed are captured from the same viewpoint.
The methods selected for the analyses are listed and characterized according to the order and scale considered (Table 3). “Order” determines the immediate neighborhood of pixels considered as an individual observation, whereas “scale” indicates the spatial domain in which the features are computed from the individual observations. First-order statistics are based on individual pixels, whereas second-order statistics also account for the spatial co-occurrence of the pixels [66]. The features considered here are extracted from the area of full plots (“global” scale) or the plots re-scaled using the GLCMs (“macro” scale) or detailed local neighborhoods (“micro” scale). The features were extracted from a circular window with a size of 254 m2 corresponding to the reference plots with 9 m radii. In addition, a 30 m × 30 m square window centered to the location of the plots were tested as an alternative computation unit in particular to evaluate the prediction of the CV of G among the satellite plots.
First-order statistics computed in the global scale are assumed to be neutral to the rotations of the image, whereas the use of second-order statistics (based on co-occurrence of two pixels) requires considering the rotation. In the GLCM approach [36], the rotation invariance is obtained by averaging the features extracted from the GLCMs over all directions. In addition to these second-order, macro-scale statistics, detailed micro-scale statistics were derived by computing gray-scale changes in local neighborhoods using the local binary pattern (LBP) approach [44]. In LBP, the rotation invariance is obtained by computing occurrence statistics for each individual rotation pattern of the neighborhood considered [44].
Since the use of higher-than-second-order statistics or other scales is rare in the literature, we believe that the results obtained by the selected methods (Table 3) can be extended to other texture quantification methods characterized by the same order and scale. The feature groups considered in the study are listed as follows.

2.3.1. HIST Features

The following first-order, general statistics were derived from the histograms of CHM pixel values: mean, standard deviation, and maximum (Table 3). These features explain vegetation height and its variation directly.

2.3.2. PATCH Features

The patch metrics are considered as coarse measures of texture that eventually produce information on the co-occurrence of groups of pixels obtained by thresholding an image. The spatial pattern of trees was modeled by thresholding CHMs for representing ground and tree patches [19,20]. In [19], a threshold of 5 m was used to assign all CHM values above that height as tree patches. In [20], an adaptive difference between the maximum height (hmax) and the standard deviation of the height values (hstd) was reported as the best canopy threshold for determining the spatial pattern of trees. Both the aforementioned constant and adaptive thresholds were tested in this study, and the patch features described in Table 3 were extracted from the thresholded CHMs.

2.3.3. GLCM Features

Textural features were computed from the GLCMs according to [36] by varying the lag parameter (i.e., the spatial interval between co-occurring pixel values) from 1 to 5. To get rotation invariant features, angular GLCMs were computed for four different offsets (0°, 45°, 90°, and 135°), and the mean values of the four angular textural features were analyzed. One out of the original 14 features [36], namely “Maximal Correlation Coefficient”, was not included in our analyses due to known computational instability problems associated with this feature. The derived features and their abbreviations are listed in Table 3. For more detailed descriptions, see [36].

2.3.4. LBP Features

The rotation invariant textural features based on the LBPs [44,67,68] are computed for every pixel by comparing the CHM value with its local circular neighborhood. The values are re-coded depending on the neighborhood: if the value of the pixel is equal or smaller than the value of its neighboring pixel, the local binary is coded as 1 and otherwise as 0. A single LBP of a pixel thus consists of the eight binary values derived from its circular neighborhood. Following [44], in which the uniformity of the image was measured by counting the number of bitwise 0/1 changes in the binary codes, we used the percentages of LBP patterns having uniformity of at most 0, 2, and 4 (abbreviated as U0, U2 and U4, respectively) as the extracted LBP features for the reference plots. All the LBP features were computed using the lags of 1, 2, and 3 CHM pixels.

2.3.5. Conventional Point-Based ALS Features

For benchmarking purposes, the most common point-based ALS predictor variables [5,9], i.e., the maximum, the mean, and the standard deviation of the height values; proportion of echoes above 2 m vegetation threshold; the 5th, 10th, 20th, …, 90th, and 95th percentiles; and the corresponding proportional densities of the ALS-based canopy height distribution were calculated according to [69] (pp. 502–503). The features were computed separately based on the first and last pulse data, i.e., “only” or “first of many” and “only” or “last of many” echoes, respectively, of up to 4 laser echoes recorded per pulse.

2.4. Supervised Prediction of Forest Attributes

The performance of the extracted features was first assessed in supervised prediction of forest attributes based on the observations made from the field-measured reference plots. The strength of the relationship between the CHM features and various forest inventory attributes was quantified using the coefficient of determination (R2, [70]). The main attention was focused on central attributes related to the properties of the forest growing stock, i.e., total stem volume (V), basal area (G), and basal-area weighted mean diameter (DgM). Due to the earlier reported potential to link the textural features with the diversity of the tree size and vegetation structure [34,71], we additionally analyzed the relationships between the textural features and the number of understory trees and variation of the DBH and G observed from within the 9 m plots and among the satellite plots, respectively.
The applicability of the extracted textural features and the benchmark variables was tested for predicting V, G, and DgM attributes of the sample plots, as these were found to be most related to the textural features in the initial tests. To normalize the potentially non-linear relationships between the forest attributes and the ALS features, three most common transformations (square, square root, and natural logarithm) of all the textural and point-based features were included as candidate predictors in addition to the absolute values in a simple linear regression (LR) analysis. The final predictors of the LR models were selected by inserting one of the predictor candidates at a time into each model and iteratively appending the feature that produced the best prediction accuracy. The LR models were formed applying a leave-out-one-plot cross-validation, i.e., excluding each plot at the time from the model fitting data and predicting for the excluded plot. The prediction accuracy was measured by root mean square error (RMSE) and relative root mean square error (RMSE%):
R M S E = i = 1 n ( y ^ i   y i ) 2 n
R M S E % = 100 *   R M S E y ¯
where yi is the observed value of variable y on plot i, ŷi is the predicted value of variable y on plot i, ӯ is the mean of the observed values, and n is the number of reference plots.

2.5. Unsupervised Classification of the Forested Area

Second, the textural features were evaluated for an unsupervised classification of the forested area, subsequently prioritizing the plots to be measured for field reference data. Earlier studies have suggested that the information in the ALS data may be condensed to a few metrics [72,73], the partitioning of which will provide a stratification corresponding closely to the structural complexity observed in the field [59,74,75]. In this study, a similar partitioning was carried out using the textural features, and the applicability of the obtained information was demonstrated by prioritizing the field plots to be measured for predicting plot V using other features.
The data were stratified using an unsupervised k-Means algorithm [76] implemented in R statistical computing environment [77]. The algorithm partitions the observations into k clusters such that the sum of squares from the observations to the assigned cluster centers is minimized. The clustering was based on the standard Euclidean distance and initialized by altogether 10,000 random solutions for the initial cluster centers to minimize the randomness in the final result. To use the standard Euclidean distance, the selected variables were required to have low inter-correlation to minimize redundancy of the information and thereby ensure an equal weight of each variable in the unsupervised classification. For this reason, the unsupervised classification was based on altogether six textural features hand-picked to capture the main variation in the forest structural characteristics but to have as low inter-correlation as possible. To include all the ALS features with the same importance in the clustering, the original feature values were normalized between 0 and 1. The normalized feature values were obtained according to Equation (5):
r i j = q i j q i m i n q i m a x q i m i n
where rij and qij are the normalized and original values of the jth observation of feature i and q i m i n and q i m a x are the smallest and largest values of i among all plots.
Since it was found problematic to determine a fixed k, i.e., to fix the expected value of the structural classes, we determined the final partitioning according to the persistence of the clusters on an interval of different k values. Based on initial tests in the data studied, the clustering was repeated by gradually increasing the value of k from 2 to 7, using an R package clue [78] to manage the resulting ensemble of the k-Means partitions. During the iterations, it was recorded whether the individual cells persisted in a cluster or shifted from one cluster to another one. As a result, each grid cell was labeled with an identifier describing its path along the clustering hierarchy. Therefore, as opposed to results with a fixed k, the final partitioning was based on the composition of the sub-clusters formed during the clustering. More details on the applied methodology are provided by [59].
The applicability of the information obtained by the stratification was demonstrated by prioritizing the sample plots to be measured for field reference data. The accuracy of the predictions depends on how extensively the reference data represent the full range of the phenomenon to be modeled [52]. We assume that the textural variation extracted from the CHMs is related to the real-world variation of the forest attributes and therefore indicative of which plots include the extreme variation that must be captured in the modeling data to produce realistic predictions. This hypothesis was tested by simulating the sampling of a reduced number of field reference plots for predicting V (cf. [52,53,54]), i.e., applying Probability Proportional to Size (PPS) sampling for the selection of the first-stage sample units [79].
The non-linear relationship between aggregated stem volume (V) and a product of ALS-point-based mean height (H) and canopy cover (CC) was modeled as:
V = a + b ( H × C C ) c
where a, b, and c were model parameters solved using the nls function of R [77]. The plots to be included in the modeling data were prioritized according to the distance to their respective cluster center. The distribution of the data to the different clusters was assumed to indicate different forest structure types and the distance to the cluster center to indicate whether a plot represented an extreme (high distance) or a typical observation of the cluster. The plots were ordered according to the decreasing distance and inserted one by one to the data used for fitting Equation (6). The RMSE of Equation (6) was recorded after including each plot.

3. Results

3.1. Effects of the CHM Parameters to the Textural Features

The spatial interpolation of unknown CHM values was based on the premise that CHM values of two pixels are related to each other, and the similarity is inversely related to the distance between their locations. In the IDW method, this was affected by the parameter α: the larger the value, the less weight was given for observations located far from the unknown CHM pixel. Figure 2 shows how the R2 of various textural features and total stem volume were affected, when the values of α were 0.1, 0.5, 1, 2… 9, and 10 with the pixel size of either 0.5 m or 1.0 m. The rest of the analyses were performed using α = 5, as with the pixel size of 0.5 m the highest correlations were obtained with that particular parameter α value, and no obvious improvements by other parameter values were discovered with the pixel size of 1.0 m.
The pixel size affected the textural features and subsequently the prediction of forest attributes. To find the optimal pixel size, we evaluated R2 of the selected textural features and the forest attributes with varying pixel sizes (Figure 3). Several important textural features had the strongest correlations with stem volume and basal area when the pixel size was 0.5–1.0 m. No significant differences between these levels could be observed based on a visual analysis of Figure 3, but since a higher number of pixel values were observed on the latter (Table 2), the pixel size was set to 1.0 m in the following analysis.

3.2. The Degree of Determination between the Textural Features and Forest Attributes

The R2 of selected textural features and the most common structural forest attributes are presented in Table 4, while the R2 of all features are presented as Supplementary Data. For benchmarking purposes, selected point-based features are also shown. None of the ALS-based features correlated well with CV(DBH) or the number of understory trees on the field plots. The observed R2 of the textural features and CV(G) on the large plots were more promising, as some of the GLCM features explained approximately one third of the variation in CV(G). Nevertheless, obtaining approximately the same degree of determination between the textural features extracted from the 9 m plots and CV(G) of the satellite plots does not support the causality of this observation. Thus, the rest of the analyses are focused on the accuracies of the tree-based forest attributes (i.e., V, G, and DgM).
The extracted textural features were fairly insensitive to the computation parameters used. The patch features explained the variation of V and G more clearly when the constant threshold (5 m) was used instead of the adaptive threshold. Some GLCM-derived features such as SAVG correlated well with V and G (respective R2 of 0.46 and 0.55 regardless of the lag) but always poorly with the DgM. The lag parameter of the GLCM features affected only a few of the relationships between the GLCM features and forest attributes. In particular, when a larger GLCM lag value (from 1 to 5) was used for calculating the CON feature, the stronger R2 was observed with the DgM (from 0.01 to 0.22). LBP-based CHM uniformity features U0 and U2 were also dependent on G having the dyadic R2 up to 0.20.
Among all ALS-based variables, CHMmean was the best feature for predicting V and G with respective R2 of 0.70 and 0.63 (Table 4). In addition, TP5m_% predicted the respective forest attributes better than any single point-based feature (R2 of 0.47 and 0.61). H_meanFP and VegRFP were the best point-based features for predicting V and G with R2 of 0.43 and 0.58, respectively. In addition, based on graphical comparisons with the point-based mean height features, CHMmean had a stronger but also more non-linear relationship with V and G. A square root transformation was required to fix the linearity of the relationship and improved the R2 of both CHM and point-based features. Conversely, the point-based features were better for predicting DgM (the best feature H70FP had an R2 of 0.74 with DgM) compared to the best textural features (CHMmax and CHMstd with R2 of 0.69 and 0.68 with DgM, respectively).

3.3. Prediction Accuracies Based on the Linear Regression Models

The CHM features and the point-based features were applied for predicting V, G, and DgM using LR. The best prediction accuracies of V (RMSE% of 25.0%) and G (RMSE% of 20.9%) were achieved by combining both textural and point-based features, whereas an optimal prediction model of DgM (RMSE% of 12.9%) was based only on point-based features (Table 5).
When all ALS-based predictor features were applied for the LR model of V, as an example, CHMmean, TPad_%, U0,lag = 1, VegRLP, H_meanLP, SVARlag = 1, H10LP, TPad_std, and TPad_avg were the first 10 predictor features selected to the model (Table 5). A similar combination of the different feature groups was used for modeling G, whereas the DgM models were more frequently based on the point features. The results suggest that combining different feature groups improves the prediction accuracies. Considering the interactions between the features in a feature selection procedure also results in a slightly different ranking of the features than the correlation analyses above.

3.4. Unsupervised Classification

The partitioning of the 155 plots to 2–7 clusters according to the selected features (CHMmean, CHMstd, TPad_avg, ASMlag = 1, IMC2lag = 1, and U2,lag = 2) resulted in altogether 27 sub-clusters in the clustering hierarchy. Excluding clusters formed by single plots, the composition of the remaining nine sub-clusters ordered according to CHMmean and labeled as A–I are shown in Supplementary Data (Figure S1) in terms of the textural feature values that generated the clusters.
When examined against the field measurements, the clustering could be linked with forest size-related attributes, particularly the aggregated stem volume, basal area, and mean diameter (Supplementary Data, Figure S2). Cluster A included the most of the youngest stands, but it was not completely uniform because of considerable variation in diameter distribution. Maturity and stocking increased from cluster B to E. Cluster F included the most stocked stands in the data, and F–I were the most prominent stands in terms of maturity for harvesting. Except for Cluster A, the clusters could not be linked to dominant species or the heterogeneity in the within-plot diameter or between-plot basal-area distributions.
The example of using the cluster dispersion obtained solely from the ALS data as a criterion to prioritize the selection of the sample plots for model fitting is illustrated in Figure 4. The RMSE of V based on applying Equation (6) to the full data considerably varied when inserting the initial plots (Figure 4b). The left column of Figure 4 suggests plots with the highest priorities in terms of the cluster dispersion are needed to capture the correct form of the relationship, whereas adding plots with lower priorities does not add information on the relationship. The RMSE stabilized after including 70 plots with the highest priority (Figure 4b).

4. Discussion

Unlike in earlier studies based on ALS, we performed a comprehensive comparison of methods for extracting and quantifying the textural variation present in the CHMs derived from a boreal forest area. Although earlier studies had related the textural variation of aerial images to size [28] and diversity [71] attributes of the vegetation and explained the effects of the parameters affecting the textural features extracted from aerial images [28], a new study was justified since the CHMs interpolated from the height values differ considerably from the spectral response recorded by the aerial images. The CHMs directly reflect the size and structure variation in the forest and were expected to thus readily improve the forest attribute prediction, eluding problems related to variations in geometry and radiometry of the spectral images [80]. Further, a considerably wider range of textural features was included in our analyses compared to [28]. The results are based on practical data, which are broadly available due to the frequent acquisitions of such data for ground elevation modeling. The results of this study may be seen as a continuation of studies promoting the usability of such data for various forest inventory applications [2,3,4,12,59,81].
While an exhaustive comparison of the texture quantification methods was found to be impossible due to the multitude of the developed methods, we attempted to select the most important ones based on a literature review of the properties of the algorithms. It could be argued whether the selection of the methods covered by our analyses was representative. However, our problem of relating the extracted features to the forest attributes aggregated from field plots with an area of 254 m2 placed some constraints on the methods. A particular emphasis was placed on the rotation invariance, which removed the requirement to run the training procedure applying the same rotation. On the other hand, the use of the isotropic features may have caused a loss of directionality from these features, which could have been an important characteristic when classifying the texture.
Assuming an underlying spatial pattern or geometric model of the trees would have enabled the use of additional approaches to quantify the texture. However, there is no evidence that the spatial pattern of the trees could be correctly classified based on remotely sensed data [20]. Even if such a classification were useful for forestry applications, departures from the assumptions could cause severe error trends [82]. The inherent spatial autocorrelation of the data further prevented the use of some texture quantification approaches. Given these challenges, it was well reasoned to focus on the statistical texture analyses and the selected combinations of order and scale (Table 3). Note that similar considerations on the feasibility of texture analyses and roles of assumptions and spatial scales were already discussed based on optical satellite images [83].
The generation of CHMs from the low-density ALS data affected the textural features derived using the selected methods. A part of the textural variation in the CHMs originates from the choice of whether to interpolate pixel values or increase the pixel size in order to use more observed height values. The use of small pixel sizes required a number of pixels to be interpolated, whereas increasing the pixel size would have allowed more pixel values to be based on real height observations. On the other hand, the degree of detail of the CHM reduced according to the pixel size. Nevertheless, Figure 1 indicates that the vertical and horizontal variation in the forest structure, as captured by the ALS point data, can be reproduced and analyzed as textural variation in the CHMs. An obvious solution to add the level of detail of the CHMs would be to increase the point density of the data. Data with a higher density might improve the texture analyses, but also allow the use of more efficient analysis methods such as detecting individual tree tops and using them as predictors [18].
The sparse point density led to, at best, a large CHM pixel size. Increasing the pixel size resulted in an increase of the CHM values observed, but a loss of information in terms of the detail. At small pixel sizes (0.5 or 1.0 m), the R2 of textural features and forest attributes were stable, whereas at larger pixel sizes their R2 degraded. The IDW interpolation parameter α only had a minor impact on the textural features computed in the study. Notably, many other techniques could have been employed in the interpolation of the CHM rasters. Acknowledging that the effects of interpolation to the quality of the CHMs should also be further studied, we selected to fill the CHMs using IDW interpolation due to its simplicity and controllability based on only few parameters. It would further have been possible to fill the CHMs using more than only or first echoes per emitted pulse, which has become popular especially when the detail of the CHM must enable individual tree detection [18]. Even though such alternative may be recommendable for practical purposes, the results obtained here using only first echo data are more likely reproducible with sensors that record a different number of echo categories [6].
The performance of the textural features derived using the selected pixel sizes (0.5 or 1.0 m) was moderately good for predicting V, G, and DgM. Among the tested feature sources, those based on the HIST, PATCH, and GLCM clearly outperformed those based on LBP. The HIST features derived from the grey-level histograms directly characterized the forest canopy height based on the CHMs. CHMmean was clearly the best individual feature for predicting stem volume and basal area among all the ALS-based features. CHMstd and CHMmax were almost as good as the point-based features to explain the variation of mean DBH. This is slightly surprising, since the CHM features should overall include less information compared to the height features computed from the entire point cloud. The reasons for the more stable relationship with the forest attributes may be related to the smoothing due to the CHM interpolation step. A similar result was reported in [4] and the potential to improve forest variables this way should be further examined. Overall, our results indicate that the description of the forest structure may be considerably simplified using the CHMmean as a substitute (or complement) for the point-based features. For improved accuracies, one should combine the CHM height features with the density features derived from the point data.
Among other textural feature groups, the PATCH features were useful in the prediction of the volume and basal area. The features derived using adaptive CHM threshold appeared as important predictors of the volume, whereas the basal area was better modeled when a constant threshold was applied. Among the GLCM features, SAVGlag = 1 was the best individual feature, while SVARlag = 1 was most often selected to the LR models from the set of other predictor candidates. Conversely, the uniformity features derived from the LBPs did not stand out with respect to any application considered. This is also slightly surprising, since LBPs should fundamentally detect more detailed microstructures than the aforementioned techniques. Analyzing the individual LBPs directly using more complex machine learning methods might provide better results. The poor performance of the method may also be related to the low resolution of the data analyzed. Nevertheless, the result may also indicate that texture quantification methods performing well in man-made material classifications may underperform in the case of vegetation with considerable natural variations.
Interestingly, the textural features sometimes outperformed point-based features regardless of which echo types were used for the computation. The use of the area-based approach with point-based canopy height and density features is a common and established practice for predicting forest inventory attributes from the ALS data [5,8]. Mainly features based on the vertical distribution of the ALS point cloud are used, however, for which reason these analyses optimally characterize only the vertical canopy conditions. The textural features of the present study were computed from the interpolated CHMs, aiming to improve the characterization of horizontal canopy conditions in ALS forest inventories. From a forest management planning perspective, the horizontal information in terms of the stem density and clustering is related to the timing of thinnings, and the inclusion of the textural features is proposed to improve these decisions [19]. Although Ozdemir and Donoghue [34] and Wood et al. [71] indicated a potential to link the textural features with the diversity of the tree size and vegetation structure, respectively, our results did not support this. The textural features were poorly correlated with the coefficients of variation of the DBH and G values as well as with the number of understory trees. The features mainly described the properties of the trees in the dominant canopy such as the total stem volume, basal area, and mean diameter. Direct measures of the understory are difficult to obtain due to transmission losses occurring in the upper canopy [84] (see also [50]) and no correlations between the textural features quantifying the top of the canopy and the understory were observed here. Although a slight correlation between the textural features and CV(G) computed from the large plots was observed, a similar correlation was obtained when the textural features were computed from a smaller window. A better solution for quantifying the within-stand variation in a practical prediction based on the grid cells could be to analyze the variation between the neighboring cells. A higher number of cells would also likely be needed to make solid conclusions.
In addition to the linear regression analyses, the textural features were demonstrated in an unsupervised classification using the well-known k-Means approach. The approach provided a data-driven partitioning of the feature space to the given number of clusters. As there are no objective criteria for determining the number of clusters [74], we followed an earlier example on determining this number [59]. By initially experimenting with different numbers of clusters, we found that 4–7 structural classes could be separated based on our study area and data. There are five classes in both the site type and development stage classification system applying to the studied forest area. However, not all classes are likely present and the data may not discriminate between all these properties. Therefore, the applied split and merge criteria resulted in a reasonable number of clusters.
Notably, the results obtained here are a re-clustering of the entire study area [59]. Whether applied to the full study area, it is reasonable to expect more clusters to be found similar to [59]. Nevertheless, the main purpose of this analysis was to determine if the developed features could be used for pre-stratification of the inventory area based on ALS data, as proposed in the introduction section. It was found that the dispersion of the clusters derived in an unsupervised mode could be used as an indicator for prioritizing the plots to be measured as the sample to form the reference data for the wall-to-wall models. We acknowledge that these are indicative results and the generalizability of this conclusion should be verified in more extensive data, allowing a split to training and validation data. For example, the textural features derived show a preliminary potential for streamlining the sampling design of the field plots to be used for later inventories, which should be further studied.
Practical inventory projects of large areas will most likely need to increasingly account for computation costs related to extracting the features. We see a considerable potential of the features developed here related specifically to their computational feasibility. For example, when designing the sampling protocol to obtain the field reference data for an ALS-based forest inventory, the sizes and orientations of potential plot locations should not be fixed but instead allowed to vary in order to design a field sample capturing the essential variation of the forest. Whether guiding the field plot sampling requires extracting features from, e.g., moving windows of multiple scales, the related computations are likely much more efficient when based on CHMs rather than underlying point data due to the efficient processing routines developed for raster data. Specifically, the time complexity related to CHM-based features reduces from the number of points to the number of pixels of the CHM, which appears to be a considerable factor in large inventory areas. Relating the computation costs of extracting various features to the added value obtained could be a potentially interesting future topic, since at least we are not aware of studies related to the time and space complexity of the ALS algorithms.

5. Conclusions

CHMs interpolated from the ALS data were found to reflect some degree of textural variation that was useful for modeling the underlying forest attributes, particularly plot volume and basal area. Applications based on supervised (linear regression) and unsupervised (k-Means clustering) learning of forest structure were demonstrated. The latter indicates the utility of the derived features for improving the sampling of the field plots for forest inventories, which should be further verified with separate validation data. Among the features considered, the statistical, patch, and GLCM features outperformed those based on point data, indicating that improved information is contained in the textural features. Features based on LBP were less useful for the same purpose. Although the tested features were selected based on a comprehensive review of potential methods to quantify image texture, we acknowledge that many more textural feature extraction techniques could be considered, especially if the requirement to produce rotation-invariant features was relaxed. In all, even the sparse-density data include potential to develop features that quantify very different aspects of the data, which should be employed together to improve analyses of vertical and horizontal forest structure.

Supplementary Materials

The following are available online at www.mdpi.com/2072-4292/8/7/582/s1, Table S1: Coefficients of determination (R2) between 87 CHM textural features and various forest attributes; Table S2: Coefficients of determination (R2) between 52 point-based features and various forest attributes; Figure S1: The distribution of the textural features CHMmean, CHMstd, TPad_avg, ASMlag = 1, IMC2lag = 1 and U2,lag = 2 in the sub-clusters formed by the k-Means algorithm; Figure S2: The distribution of the field-measured forest structural attributes according to the sub-clusters formed based on the ALS data.

Acknowledgments

The contribution of Mikko T. Niemi was enabled by the funding of Metsämiesten Säätiö Foundation. The field measurements, preparation of this study, and open access publishing costs were financed by the Research Funds of the University of Helsinki. The study is a contribution to Academy of Finland, project 272195. The authors would like to thank Joni Imponen and Mika Salmi for their contributions related to the field measurements and Markus Holopainen and Mikko Vastaranta for discussions related to the study.

Author Contributions

Jari Vauhkonen conceived and designed the experiments; Mikko T. Niemi computed the plot-level forest attributes, implemented the CHM generation and textural feature extraction, and analyzed the data with respect to the supervised prediction of forest attributes; Jari Vauhkonen analyzed the data with respect to the unsupervised classification method; and Mikko T. Niemi and Jari Vauhkonen wrote the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Maltamo, M.; Næsset, E.; Vauhkonen, J. Forestry Applications of Airborne Laser Scanning—Concepts and Case Studies, 1st ed.; Springer: Dordrecht, The Netherlands, 2014. [Google Scholar]
  2. Nord-Larsen, T.; Schumacher, J. Estimation of forest resources from a country wide laser scanning survey and national forest inventory data. Remote Sens. Environ. 2012, 119, 148–157. [Google Scholar] [CrossRef]
  3. Villikka, M.; Packalén, P.; Maltamo, M. The suitability of leaf-off airborne laser scanning data in an area-based forest inventory of coniferous and deciduous trees. Silva Fenn. 2012, 46, 99–110. [Google Scholar] [CrossRef]
  4. Kankare, V.; Vauhkonen, J.; Holopainen, M.; Vastaranta, M.; Hyyppä, J.; Hyyppä, H.; Alho, P. Sparse density, leaf-off airborne laser scanning data in aboveground biomass component prediction. Forests 2015, 6, 1839–1857. [Google Scholar] [CrossRef]
  5. Næsset, E. Predicting forest stand characteristics with airborne scanning laser using a practical two-stage procedure and field data. Remote Sens. Environ. 2002, 80, 88–99. [Google Scholar] [CrossRef]
  6. Næsset, E. Area-based inventory in Norway—From innovation to an operational reality. In Forestry Applications of Airborne Laser Scanning—Concepts and Case Studies, 1st ed.; Maltamo, M., Næsset, E., Vauhkonen, J., Eds.; Springer: Dordrecht, The Netherlands, 2014; pp. 215–240. [Google Scholar]
  7. Maltamo, M.; Packalen, P. Species-specific management inventory in Finland. In Forestry Applications of Airborne Laser Scanning—Concepts and Case Studies, Managing Forest Ecosystems, 1st ed.; Maltamo, M., Næsset, E., Vauhkonen, J., Eds.; Springer: Dordrecht, The Netherlands, 2014; pp. 241–252. [Google Scholar]
  8. White, J.C.; Wulder, M.A.; Varhola, A.; Vastaranta, M.; Coops, N.C.; Cook, B.D.; Pitt, D.; Woods, M. A Best Practices Guide for Generating Forest Inventory Attributes from Airborne Laser Scanning Data Using an Area-Based Approach (Version 2.0); Information Report FI-X-010; The Canadian Wood Fibre Centre: Victoria, BC, Canada, 2013. [Google Scholar]
  9. Magnussen, S.; Boudewyn, P. Derivations of stand heights from airborne laser scanner data with canopy-based quantile estimators. Can. J. For. Res. 1998, 28, 1016–1031. [Google Scholar] [CrossRef]
  10. Vauhkonen, J.; Seppänen, A.; Packalén, P.; Tokola, T. Improving species-specific plot volume estimates based on airborne laser scanning and image data using alpha shape metrics and balanced field data. Remote Sens. Environ. 2012, 124, 534–541. [Google Scholar] [CrossRef]
  11. Vauhkonen, J.; Næsset, E.; Gobakken, T. Deriving airborne laser scanning based computational canopy volume for forest biomass and allometry studies. ISPRS J. Photogramm. Remote Sens. 2014, 96, 57–66. [Google Scholar] [CrossRef]
  12. Vauhkonen, J.; Holopainen, M.; Kankare, V.; Vastaranta, M.; Viitala, R. Geometrically explicit description of forest canopy based on 3D triangulations of airborne laser scanning data. Remote Sens. Environ. 2016, 173, 248–257. [Google Scholar] [CrossRef]
  13. Vehmas, M.; Packalén, P.; Maltamo, M.; Eerikäinen, K. Using airborne laser scanning data for detecting canopy gaps and their understory type in mature boreal forest. Ann. For. Sci. 2011, 68, 825–835. [Google Scholar] [CrossRef]
  14. Bouvier, M.; Durrieu, S.; Fournier, R.A.; Renaud, J. Generalizing predictive models of forest inventory attributes using an area-based approach with airborne LiDAR data. Remote Sens. Environ. 2015, 156, 322–334. [Google Scholar] [CrossRef]
  15. Latifi, H.; Heurich, M.; Hartig, F.; Müller, J.; Krzystek, P.; Jehl, H.; Dech, S. Estimating over-and understorey canopy density of temperate mixed stands by airborne LiDAR data. Forestry 2016, 89, 69–81. [Google Scholar] [CrossRef]
  16. Lee, A.C.; Lucas, R.M. A LiDAR-derived canopy density model for tree stem and crown mapping in Australian forests. Remote Sens. Environ. 2007, 111, 493–518. [Google Scholar] [CrossRef]
  17. Ferraz, A.; Mallet, C.; Jacquemoud, S.; Rito Gonçalves, G.; Tomé, M.; Soares, P.; Gomes Pereira, L.; Bretar, F. Canopy density model: A new ALS-derived product to generate multilayer crown cover maps. IEEE Trans. Geosci. Remote Sens. 2015, 53, 6776–6790. [Google Scholar] [CrossRef]
  18. Hyyppä, J.; Yu, X.; Hyyppä, H.; Vastaranta, M.; Holopainen, M.; Kukko, A.; Kaartinen, H.; Jaakkola, A.; Vaaja, M.; Koskinen, J.; Alho, P. Advances in forest inventory using airborne laser scanning. Remote Sens. 2012, 4, 1190–1207. [Google Scholar] [CrossRef]
  19. Pippuri, I.; Kallio, E.; Maltamo, M.; Peltola, H.; Packalén, P. Exploring horizontal area-based metrics to discriminate the spatial pattern of trees and need for first thinning using airborne laser scanning. Forestry 2012, 85, 305–314. [Google Scholar] [CrossRef]
  20. Packalén, P.; Vauhkonen, J.; Kallio, E.; Peuhkurinen, J.; Pitkänen, J.; Pippuri, I.; Strunk, J.; Maltamo, M. Predicting the spatial pattern of trees by airborne laser scanning. Int. J. Remote Sens. 2013, 34, 5154–5165. [Google Scholar] [CrossRef]
  21. Cohen, W.B.; Spies, T.A. Estimating structural attributes of Douglas-fir/western hemlock forest stands from Landsat and SPOT imagery. Remote Sens. Environ. 1992, 41, 1–17. [Google Scholar] [CrossRef]
  22. Franklin, S.E.; Waring, R.H.; McCreight, R.W.; Cohen, W.B.; Fiorella, M. Aerial and satellite sensor detection and classification of western spruce budworm defoliation in a subalpine forest. Can. J. Remote Sens. 1995, 21, 299–308. [Google Scholar] [CrossRef]
  23. Wulder, M.A.; Franklin, S.E.; Lavigne, M.B. High spatial resolution optical image texture for improved estimation of forest stand leaf area index. Can. J. Remote Sens. 1996, 22, 441–449. [Google Scholar] [CrossRef]
  24. Franklin, S.E.; Hall, R.J.; Moskal, L.M.; Maudie, A.J.; Lavigne, M.B. Incorporating texture into classification of forest species composition from airborne multispectral images. Int. J. Remote Sens. 2000, 21, 61–79. [Google Scholar] [CrossRef]
  25. Franklin, S.E.; Wulder, M.A.; Gerylo, G.R. Texture analysis of IKONOS panchromatic data for Douglas-fir forest age class separability in British Columbia. Int. J. Remote Sens. 2001, 22, 2627–2632. [Google Scholar] [CrossRef]
  26. Muinonen, E.; Maltamo, M.; Hyppänen, H.; Vainikainen, V. Forest stand characteristics estimation using a most similar neighbor approach and image spatial structure information. Remote Sens. Environ. 2001, 78, 223–228. [Google Scholar] [CrossRef]
  27. Anttila, P. Nonparametric estimation of stand volume using spectral and spatial features of aerial photographs and old inventory data. Can. J. For. Res. 2002, 32, 1849–1857. [Google Scholar] [CrossRef]
  28. Tuominen, S.; Pekkarinen, A. Performance of different spectral and textural aerial photograph features in multi-source inventory. Remote Sens. Environ. 2005, 94, 256–268. [Google Scholar] [CrossRef]
  29. Maltamo, M.; Malinen, J.; Packalén, P.; Suvanto, A.; Kangas, J. Nonparametric estimation of stem volume using airborne laser scanning, aerial photography, and stand-register data. Can. J. For. Res. 2006, 36, 426–436. [Google Scholar] [CrossRef]
  30. Packalén, P.; Maltamo, M. Predicting the plot volume by tree species using airborne laser scanning and aerial photographs. For. Sci. 2006, 52, 611–622. [Google Scholar]
  31. Hyyppä, J.; Inkinen, M. Detecting and estimating attributes for single trees using laser scanner. Photogramm. J. Finland 1999, 16, 27–42. [Google Scholar]
  32. Vauhkonen, J.; Tokola, T.; Maltamo, M.; Packalén, P. Effects of pulse density on predicting characteristics of individual trees of Scandinavian commercial species using alpha shape metrics based on airborne laser scanning data. Can. J. Remote Sens. 2008, 34, S441–S459. [Google Scholar] [CrossRef]
  33. Heinzel, J.; Koch, B. Investigating multiple data sources for tree species classification in temperate forest and use for single tree delineation. Int. J. Appl. Earth Obs. Geoinfo. 2012, 18, 101–110. [Google Scholar] [CrossRef]
  34. Ozdemir, I.; Donoghue, D.N.M. Modelling tree size diversity from airborne laser scanning using canopy height models with image texture measures. For. Ecol. Manage. 2013, 295, 28–37. [Google Scholar] [CrossRef]
  35. Pippuri, I.; Suvanto, A.; Maltamo, M.; Korhonen, K.T.; Pitkänen, J.; Packalen, P. Classification of forest land attributes using multi-source resolution data. Int. J. Appl. Earth Obs. Geoinfo. 2016, 44, 11–22. [Google Scholar] [CrossRef]
  36. Haralick, R.M.; Shanmugam, K.; Dinstein, J. Textural features for image classification. IEEE Trans. Syst. Man Cybernetics 1973, 3, 610–621. [Google Scholar] [CrossRef]
  37. Ruiz, L.A.; Fdez-Sarría, A.; Recio, J.A. Texture feature extraction for classification of remote sensing data using wavelet decomposition: A comparative study. In Proceedings of the 20th ISPRS Conference, Istanbul, Turkey, 12–23 July 2004.
  38. Cressie, N. Statistics for Spatial Data: Wiley Series in Probability and Statistics; John Wiley & Sons: Hoboken, NJ, USA, 1993; pp. 105–209. [Google Scholar]
  39. St-Onge, B.A.; Cavayas, F. Estimating forest stand structure from high resolution imagery using the directional variogram. Int. J. Remote Sens. 1995, 16, 1999–2021. [Google Scholar] [CrossRef]
  40. St-Onge, B.A.; Cavayas, F. Automated forest structure mapping from high resolution imagery based on directional semivariogram estimates. Remote Sens. Environ. 1997, 61, 82–95. [Google Scholar] [CrossRef]
  41. Wulder, M.; Boots, B. Local spatial autocorrelation characteristics of remotely sensed imagery assessed with the Getis statistic. Int. J. Remote Sens. 1998, 19, 2223–2231. [Google Scholar] [CrossRef]
  42. Wallerman, J.; Holmgren, J. Estimating field-plot data of forest stands using airborne laser scanning and SPOT HRG data. Remote Sens. Environ. 2007, 110, 501–508. [Google Scholar] [CrossRef]
  43. Riitters, K.H.; O'Neill, R.V.; Hunsaker, C.T.; Wickham, J.D.; Yankee, D.H.; Timmins, S.P.; Jones, K.B.; Jackson, B.L. A factor analysis of landscape pattern and structure metrics. Landsc. Ecol. 1995, 10, 23–39. [Google Scholar] [CrossRef]
  44. Ojala, T.; Pietikäinen, M.; Mäenpää, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Patt. Anal. Mach. Intell. 2002, 24, 971–987. [Google Scholar] [CrossRef]
  45. Varma, M.; Zisserman, A. A statistical approach to texture classification from single images. Int. J. Comput. Vis. 2005, 62, 61–81. [Google Scholar] [CrossRef]
  46. Varma, M.; Zisserman, A. A statistical approach to material classification using image patch exemplars. IEEE Trans. Patt. Anal. Mach. Intell. 2009, 31, 2032–2047. [Google Scholar] [CrossRef]
  47. Crosier, M.; Griffin, L.D. Using basic image features for texture classification. Int. J. Comput. Vis. 2010, 88, 447–460. [Google Scholar] [CrossRef]
  48. Perea, J.A.; Carlsson, G.A. Klein-Bottle-Based dictionary for texture representation. Int. J. Comput. Vision 2014, 107, 75–97. [Google Scholar] [CrossRef]
  49. Zimble, D.A.; Evans, D.L.; Carlson, G.C.; Parker, R.C.; Grado, S.C.; Gerard, P.D. Characterizing vertical forest structure using small-footprint airborne LiDAR. Remote Sens. Environ. 2003, 87, 171–182. [Google Scholar] [CrossRef]
  50. Maltamo, M.; Packalén, P.; Yu, X.; Eerikäinen, K.; Hyyppä, J.; Pitkänen, J. Identifying and quantifying structural characteristics of heterogeneous boreal forests using laser scanner data. For. Ecol. Manag. 2005, 216, 41–50. [Google Scholar] [CrossRef]
  51. Liu, X. Airborne LiDAR for DEM generation: Some critical issues. Progr. Phys. Geogr. 2008, 32, 31–49. [Google Scholar]
  52. Maltamo, M.; Bollandsås, O.M.; Næsset, E.; Gobakken, T.; Packalén, P. Different plot selection strategies for field training data in ALS-assisted forest inventory. Forestry 2011, 84, 23–31. [Google Scholar] [CrossRef]
  53. Dalponte, M.; Martinez, C.; Rodeghiero, M.; Gianelle, D. The role of ground reference data collection in the prediction of stem volume with LiDAR data in mountain areas. ISPRS J. Photogramm. Remote Sens. 2011, 66, 787–797. [Google Scholar] [CrossRef]
  54. Gobakken, T.; Korhonen, L.; Næsset, E. Laser-assisted selection of field plots for an area-based forest inventory. Silva Fenn. 2013. [Google Scholar] [CrossRef]
  55. Maltamo, M.; Ørka, H.O.; Bollandsås, O.M.; Gobakken, T.; Næsset, E. Using pre-classification to improve the accuracy of species-specific forest attribute estimates from airborne laser scanner data and aerial images. Scand. J. For. Res. 2015, 30, 336–345. [Google Scholar] [CrossRef]
  56. National Land Survey of Finland. File Service of Open Data. Available online: https://tiedostopalvelu.maanmittauslaitos.fi/tp/kartta?lang=en (accessed on 12 May 2016).
  57. Axelsson, P. DEM generation from laserscanner data using adaptive TIN models. Int. Arch. Photogramm. Remote Sens. 2000, 33, 16–22. [Google Scholar]
  58. Vilhomaa, J.; Laaksonen, H. Valtakunnallinen laserkeilaus—Testityöstä tuotantoon (in Finnish for “National laser scanning—From testing to production”). Photogramm. J. Finland 2011, 22, 82–91. [Google Scholar]
  59. Vauhkonen, J.; Imponen, J. Unsupervised classification of airborne laser scanning data to locate potential wildlife habitats for forest management planning. Forestry 2016. [Google Scholar] [CrossRef]
  60. Näslund, M. Skogsförsöksanstaltens gallringsförsök i tallskog (in Swedish for “Forestry Institute’s thinning trial in a pine forest”). Meddelanden 1936, 29, 169. [Google Scholar]
  61. Siipilehto, J. Improving the accuracy of predicted basal-area diameter distribution in advanced stands by determining stem number. Silva Fenn. 1999, 33, 281–301. [Google Scholar] [CrossRef]
  62. Laasasenaho, J. Taper curve volume functions for pine, spruce and birch. Comm. Inst. For. Fenn. 1982, 108, 1–74. [Google Scholar]
  63. Lu, G.Y.; Wong, D.W. An adaptive inverse-distance weighting spatial interpolation technique. Comput. Geosci. 2008, 34, 1044–1055. [Google Scholar] [CrossRef]
  64. Wu, J. Rotation Invariant Classification of 3D Surface Texture Using photometric Stereo. Ph.D. Thesis, Heriot-Watt University, Edinburgh, UK, 2003. [Google Scholar]
  65. Ojala, T.; Pietikäinen, M. Texture Classification. In CVonline—Compendium of Computer Vision; Fisher, R.B., Ed.; Available online: http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OJALA1/texclas.htm (accessed on 12 May 2016).
  66. Tuceryan, M.; Vain, A.K. Texture analysis. In Handbook of Pattern Recognition and Vision; Chen, C.H., Pau, L.F., Wang, P.S.P., Eds.; World Scientific: Singapore, Singapore, 1993; pp. 235–276. [Google Scholar]
  67. Ahonen, T.; Pietikäinen, M. Face description with local binary patterns: Application to face recognition. IEEE Trans. Patt. Anal. Mach. Intell. 2006, 28, 2037–2041. [Google Scholar] [CrossRef]
  68. Guo, Z.; Zhang, L.; Zhang, D. A completed modeling of local binary pattern operator for texture classification. IEEE Trans. Image Proc. 2010, 19, 1657–1663. [Google Scholar]
  69. Korhonen, L.; Peuhkurinen, J.; Malinen, J.; Suvanto, A.; Maltamo, M.; Packalén, P.; Kangas, J. The use of airborne laser scanning to estimate sawlog volumes. Forestry 2008, 81, 499–510. [Google Scholar]
  70. Nagelkerke, N.J.D. A note on a general definition of the coefficient of determination. Biometrika 1991, 78, 691–692. [Google Scholar] [CrossRef]
  71. Wood, E.M.; Pidgeon, A.M.; Radeloff, V.C.; Keuler, N.S. Image texture as a remotely sensed measure of vegetation structure. Remote Sens. Environ. 2012, 121, 516–526. [Google Scholar] [CrossRef]
  72. Kane, V.R.; McGaughey, R.J.; Bakker, J.D.; Gersonde, R.F.; Lutz, J.A.; Franklin, J.F. Comparisons between field-and LiDAR-based measures of stand structural complexity. Can. J. For. Res. 2010, 40, 761–773. [Google Scholar] [CrossRef]
  73. Leiterer, R.; Furrer, R.; Schaepman, M.E.; Morsdorf, F. Forest canopy-structure characterization: A data-driven approach. For. Ecol. Manag. 2015, 358, 48–61. [Google Scholar] [CrossRef]
  74. Pascual, C.; García-Abril, A.; García-Montero, L.G.; Martín-Fernández, S.; Cohen, W.B. Object-based semi-automatic approach for forest structure characterization using lidar data in heterogeneous Pinus sylvestris stands. For. Ecol. Manag. 2008, 255, 3677–3685. [Google Scholar] [CrossRef]
  75. Thompson, S.D.; Nelson, T.A.; Giesbrecht, I.; Frazer, G.; Saunders, S.C. Data-driven regionalization of forested and non-forested ecosystems in coastal British Columbia with LiDAR and RapidEye imagery. Appl. Geogr. 2016, 69, 35–50. [Google Scholar] [CrossRef]
  76. Hartigan, J.A.; Wong, M.A. A k-means clustering algorithm. Appl. Stat. 1979, 28, 100–108. [Google Scholar] [CrossRef]
  77. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2016. [Google Scholar]
  78. Hornik, K. A clue for cluster ensembles. J. Stat. Softw. 2005, 14, 1–25. [Google Scholar] [CrossRef]
  79. Pesonen, A.; Leino, O.; Maltamo, M.; Kangas, A. Comparison of field sampling methods for assessing coarse woody debris and use of airborne laser scanning as auxiliary information. For. Ecol. Manag. 2009, 257, 1532–1541. [Google Scholar] [CrossRef]
  80. Mäkinen, A.; Korpela, I.; Tokola, T.; Kangas, A. Effects of imaging conditions on crown diameter measurements from high-resolution aerial images. Can. J. For. Res. 2006, 36, 1206–1217. [Google Scholar] [CrossRef]
  81. Niemi, M.; Vastaranta, M.; Peuhkurinen, J.; Holopainen, M. Forest inventory attribute prediction using airborne laser scanning in low-productive forestry-drained boreal peatlands. Silva Fenn. 2015, 49. [Google Scholar] [CrossRef]
  82. Vauhkonen, J.; Mehtätalo, L. Matching remotely sensed and field-measured tree size distributions. Can. J. For. Res. 2015, 45, 353–363. [Google Scholar] [CrossRef]
  83. Woodcock, C.E.; Strahler, A.H. The factor of scale in remote sensing. Remote Sens. Environ. 1987, 21, 311–332. [Google Scholar] [CrossRef]
  84. Korpela, I.; Hovi, A.; Morsdorf, F. Understory trees in airborne LiDAR data—Selective mapping due to transmission losses and echo-triggering mechanisms. Remote Sens. Environ. 2012, 119, 92–104. [Google Scholar] [CrossRef]
Figure 1. (a) Points with height values >5 m above ground level extracted from an example plot. Canopy height models interpolated from the point data using pixel sizes of: (b) 0.5 m; (c) 0.75 m; (d) 1.0 m; (e) 2.0 m; and (f) 3.0 m. In (a), the sizes of the dots are scaled according to the height values, whereas the tick marks of each sub-plot correspond to a horizontal distance of 2 m.
Figure 1. (a) Points with height values >5 m above ground level extracted from an example plot. Canopy height models interpolated from the point data using pixel sizes of: (b) 0.5 m; (c) 0.75 m; (d) 1.0 m; (e) 2.0 m; and (f) 3.0 m. In (a), the sizes of the dots are scaled according to the height values, whereas the tick marks of each sub-plot correspond to a horizontal distance of 2 m.
Remotesensing 08 00582 g001
Figure 2. The R2 of selected textural features and total stem volume with the pixel sizes of 0.5 m and 1.0 m. The interpolation parameter (α) was set to 0.1, 0.5, 1, 2…, 9 and 10.
Figure 2. The R2 of selected textural features and total stem volume with the pixel sizes of 0.5 m and 1.0 m. The interpolation parameter (α) was set to 0.1, 0.5, 1, 2…, 9 and 10.
Remotesensing 08 00582 g002
Figure 3. The effect of the pixel size to the coefficients of determination (R2) of selected textural features and plot-level forest attributes (total stem volume, basal area, mean diameter (DBH), and DBH variation). The interpolation parameter α was set to 5.
Figure 3. The effect of the pixel size to the coefficients of determination (R2) of selected textural features and plot-level forest attributes (total stem volume, basal area, mean diameter (DBH), and DBH variation). The interpolation parameter α was set to 5.
Remotesensing 08 00582 g003
Figure 4. (a) The relationship between the airborne laser scanning estimate of mean height (H) × canopy cover (CC) and plot volume (V) in the data studied. The filled and open circles correspond to clusters including single or several plots, respectively, and the size of the open circles indicates the dispersion of the plot from the respective cluster center; (b) The development of the root-mean-squared-error, when the prediction of V is based on Equation (6) and an increasing number of sample plots for model fitting, selected in the order of the decreasing dispersion.
Figure 4. (a) The relationship between the airborne laser scanning estimate of mean height (H) × canopy cover (CC) and plot volume (V) in the data studied. The filled and open circles correspond to clusters including single or several plots, respectively, and the size of the open circles indicates the dispersion of the plot from the respective cluster center; (b) The development of the root-mean-squared-error, when the prediction of V is based on Equation (6) and an increasing number of sample plots for model fitting, selected in the order of the decreasing dispersion.
Remotesensing 08 00582 g004
Table 1. Central total and species-specific characteristics of the reference data, aggregated to the plot-level.
Table 1. Central total and species-specific characteristics of the reference data, aggregated to the plot-level.
AttributeRangeMeanSD *
V, m3ha−17.0–673.0184.0107.0
G, m2ha−11.3–48.019.89.0
DgM, cm8.4–42.722.96.9
HgM, m7.9–32.319.04.7
Vpine, m3ha−10–360.0101.089.0
Gpine, m2ha−10–33.110.68.7
DgM_pine, cm5.3–50.024.39.0
HgM_pine, m3.5–32.019.45.3
Vspruce, m3ha−10–464.053.887.9
Gspruce, m2ha−10–36.35.77.7
DgM_spruce, cm5.0–40.015.99.8
HgM_spruce, m4.0–32.513.77.6
Vdeciduous, m3ha−10–228.029.043.2
Gdeciduous, m2ha−10–24.93.54.9
DgM_deciduous, cm5.0–52.016.39.3
HgM_deciduous, m6.0–33.016.26.2
* Standard deviation.
Table 2. The proportion of observed canopy height model pixel values with the considered pixel size in the reference plots.
Table 2. The proportion of observed canopy height model pixel values with the considered pixel size in the reference plots.
Pixel SizeMeanSD *
0.5 m27%11%
0.75 m49%13%
1.0 m59%12%
1.5 m93%6%
2.0 m97%3%
3.0 m100%2%
* Standard deviation.
Table 3. Textural feature groups considered.
Table 3. Textural feature groups considered.
GroupOrderScaleImage Data SourceFeatures and Their Abbreviations
HIST1stglobalGray-level histogramMean (CHMmean), Standard deviation (CHMstd), Maximum (CHMmax)
PATCH2ndglobalThresholded image aProportion of field plot total area covered by tree patches (TPt_%), Number of tree patches per hectare (TPt_den), Average area of tree patches (TPt_avg), Standard deviation of the area of tree patches (TPt_std), Average number of pixels in 4-neighborhood belonging to the same tree patch (TPt_N)
GLCM2ndmacroGrey-level co-occurrence matrix bAngular second moment (ASM), Contrast (CON), Correlation (COR), Sum of squares: variance (VAR), Inverse difference moment (IDM), Sum average (SAVG), Sum variance (SVAR), Sum entropy (SENT), Entropy (ENT), Difference variance (DVAR), Difference entropy (DENT), 2 × Information measure of correlation (IMC1, IMC2)
LBP2ndmicroLocal binary pattern (LBP) cUniformity of a LBP computed as percentages of at most 0 (U0), 2 (U2) or 4 (U4) bitwise transitions in the binary code of a circular neighborhood
a In the abbreviations of the features, subscript t is either 5 m or ad, indicating the use of a constant (5 m) or adaptive (ad = hmax–hstd) height threshold, respectively; b All features were computed using a lag of 1, 2, 3, 4, or 5 pixels; c All features were computed using a lag of 1, 2, or 3 pixels.
Table 4. The coefficients of determination (R2) of selected features (17 textural features and three point-based features) and three forest attributes with the pixel size of 1.0 m and interpolation parameter of 5.
Table 4. The coefficients of determination (R2) of selected features (17 textural features and three point-based features) and three forest attributes with the pixel size of 1.0 m and interpolation parameter of 5.
FeatureVGDgM
CHMmean0.700.630.15
CHMstd0.080.000.68
CHMmax0.220.070.69
TP5m_%0.470.610.00
TP5m_N0.350.430.06
TPad_%0.320.280.04
TPad_N0.170.140.19
CONlag = 10.020.010.01
CONlag = 30.000.000.11
CONlag = 50.000.010.22
SAVGlag = 10.460.550.00
SAVGlag = 30.460.550.00
SAVGlag = 50.450.550.00
U0,lag = 10.130.180.00
U0,lag = 10.140.200.01
U2,lag = 10.060.120.00
U2,lag = 20.080.130.03
H_meanFP0.430.220.51
VegRFP0.380.580.00
H70FP0.230.070.74
Table 5. The plot-level prediction accuracy of total stem volume, basal area and mean diameter using different sets of predictor features and their transformations in linear regression analyses. Altogether, 10 predictors are listed from each feature group in the order they were selected (footnote). The predictors included in the models were the first 1, 2, 3, or 5 predictors mentioned.
Table 5. The plot-level prediction accuracy of total stem volume, basal area and mean diameter using different sets of predictor features and their transformations in linear regression analyses. Altogether, 10 predictors are listed from each feature group in the order they were selected (footnote). The predictors included in the models were the first 1, 2, 3, or 5 predictors mentioned.
PredictorsV, m3ha−1G, m2ha−1DgM, cm
RMSERMSE%RMSERMSE%RMSERMSE%
CHM Textural Features a
The best feature59.832.65.628.33.716.3
Set of 2 features54.329.65.226.23.415.0
Set of 3 features52.328.54.924.63.314.6
Set of 5 features50.627.64.823.93.113.7
Point-based Features b
The best feature81.644.45.729.03.314.4
Set of 2 features57.431.24.723.83.013.2
Set of 3 features55.230.04.723.53.013.0
Set of 5 features53.028.94.623.02.812.2
CHM Textural Features and Point-based Features Combined c
The best feature59.832.65.628.33.314.4
Set of 2 features54.329.64.924.53.013.2
Set of 3 features52.328.54.723.52.912.5
Set of 5 features49.326.84.421.92.711.8
a V = f(CHMmean + TPad_%2 + CHMmean2 + √U0,lag = 1 + SVARlag = 1 + TP5m_std2 + TPad_std2 + TPad_avg2 + CORlag = 1 + √TPad_std); G = f(CHMmean + TP5m_%2 + TPad_% + CHMmax2 + CORlag = 12 + √U0,lag = 1 + CORlag = 1 + TP5m_std + U0,lag = 22 + log(IMC2lag = 1)); D = f(CHMmax2 + CONlag = 5 + IMC1lag = 12 + CHMmean2 + √U0,lag = 1 + √U0,lag = 2 + log(SVAR) + √CHMmean + TPad_%2 + log(TPad_avg)); b V = f(H_meanFP2 + VegRFP2 + H10FP + H_meanLP2 + H10LP2 + VegRLP2 + D05LP2 + H_stdLP2 + H30FP + log(H05LP)); G = f(VegRFP2 + H_meanLP2 + H10lP2 + √VegRFP + log(H_meanFP) + VegRLP2 + H50LP2 + D05FP2 + H30FP + log(D90FP) + H_maxFP); D = f(H70FP2 + H60LP2 + H20FP + √H10LP + √H30FP + H60FP + log(D90FP) + D95FP2 + H90LP2 + H90FP2); c V = f(CHMmean + TPad_%2 + CHMmean2 + √U0,lag = 1 + VegRLP2 + H_meanLP2 + SVARlag = 1 + H10LP2 +TPad_std + TPad_avg2); G = f(CHMmean + VegRFP2 + H_meanLP2 + TPad_den2 + VegRLP2 + IDMlag = 32 + H10FP2 + √U0,lag = 1 + TP5m_N2 + H05LP2); D = f(H70FP2 + H60LP2 + log(IMC2lag = 1) + H10LP + TPad_%2 + log(TPad_avg) + √H20FP + √H30FP + H05FP2 + D50LP2).

Share and Cite

MDPI and ACS Style

Niemi, M.T.; Vauhkonen, J. Extracting Canopy Surface Texture from Airborne Laser Scanning Data for the Supervised and Unsupervised Prediction of Area-Based Forest Characteristics. Remote Sens. 2016, 8, 582. https://doi.org/10.3390/rs8070582

AMA Style

Niemi MT, Vauhkonen J. Extracting Canopy Surface Texture from Airborne Laser Scanning Data for the Supervised and Unsupervised Prediction of Area-Based Forest Characteristics. Remote Sensing. 2016; 8(7):582. https://doi.org/10.3390/rs8070582

Chicago/Turabian Style

Niemi, Mikko T., and Jari Vauhkonen. 2016. "Extracting Canopy Surface Texture from Airborne Laser Scanning Data for the Supervised and Unsupervised Prediction of Area-Based Forest Characteristics" Remote Sensing 8, no. 7: 582. https://doi.org/10.3390/rs8070582

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop