This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (

This study evaluated the accuracy of boreal forest above-ground biomass (AGB) and volume estimates obtained using airborne laser scanning (ALS) and RapidEye data in a two-phase sampling method. Linear regression-based estimation was employed, and performance was evaluated on an independent validation dataset by assessing the bias and the root mean square error (RMSE). In phase I, ALS data from 50 field plots were used to predict AGB and volume for 200 surrogate plots. In phase II, the ALS-simulated surrogate plots were used as ground truth to estimate AGB and volume from the RapidEye data for the study area. The resulting RapidEye models were validated against a separate set of 28 plots. The RapidEye models showed promising accuracy, with a relative RMSE of 19%–20% for both volume and AGB. The evaluated biomass inventory concept could support future forest monitoring and decision making for the sustainable use of forest resources.

Forests play an important role in global carbon cycling: the world’s forests sequester and conserve more carbon than all other terrestrial ecosystems and account for 90% of the annual carbon flux between the atmosphere and the Earth’s land surface [

The estimation of forest carbon is still relatively uncertain considering the errors in regional carbon stock estimates. Houghton [

Finland, the northernmost European country, has 73% of its land area forested [^{12} kg of carbon in the trees alone [

A wide range of approaches have been proposed for quantifying forest biomass using active and passive remote sensing systems. Although many alternative remote sensing techniques have been suggested for the estimation of forest attributes such as AGB, the most promising seems to be airborne laser scanning (ALS) [

RapidEye is a multispectral satellite sensor launched in 2008 that acquires images at high spatial resolution (5 m) in five spectral bands. RapidEye images have been used in several forestry operations, including cost-effective monitoring, harvesting and mapping [

The most commonly used features derived from the optical data to predict forest attributes are the spectral and textural features. Spectral features describe the tonal variation in portions of the electromagnetic spectrum. Textural features contain information about the spatial distribution of tonal variations within an image. Texture has qualities such as periodicity and scale; it can therefore describe, for example, the direction, coarseness, and contrast of image components [

Although the cost of ALS data has decreased considerably, it remains relatively high compared with optical sensors. Decreasing the number of field reference plots is desirable to minimize the cost of data collection, as the cost of field measurements was estimated to be around 100 € per sample plot (9 m radius) in Finland [

The objective of this study was to assess AGB and volume using ALS data and RapidEye satellite data in a two-phase sampling procedure. ALS-estimated sample plots (called surrogate plots) were used as simulated ground truth instead of more expensive field sample plots. A small number of field sample plots were collected to create ALS-based models, which were applied to predict the AGB and volume for the surrogate plots. Finally, the RapidEye image and the surrogate plots were used to generalize the predictions for the study area.
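The two-phase procedure described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the study's data or models: the plot counts (50, 200, 28) come from the text, while the `simulate` function, the single ALS metric, the single RapidEye feature and all coefficients are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    """Apply a fitted coefficient vector to new predictor values."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def simulate(n):
    """Hypothetical plots: 'true' AGB, one ALS metric, one RapidEye feature."""
    agb = rng.uniform(50, 220, n)                  # ton/ha
    als = 0.1 * agb + rng.normal(0, 1.5, n)        # strongly related to AGB
    rapideye = -0.5 * agb + rng.normal(0, 12, n)   # more weakly related
    return agb, als.reshape(-1, 1), rapideye.reshape(-1, 1)

agb_field, als_field, _ = simulate(50)    # field training plots (phase I)
_, als_surr, re_surr = simulate(200)      # surrogate plots: no field AGB is used
agb_val, _, re_val = simulate(28)         # independent validation plots

# Phase I: ALS model from field plots, applied to the surrogate plots
als_model = fit_ols(als_field, agb_field)
agb_surr_hat = predict_ols(als_model, als_surr)

# Phase II: RapidEye model trained on the ALS-predicted surrogate values
re_model = fit_ols(re_surr, agb_surr_hat)
agb_val_hat = predict_ols(re_model, re_val)

rmse = np.sqrt(np.mean((agb_val - agb_val_hat) ** 2))
rel_rmse = 100.0 * rmse / agb_val.mean()
print(f"validation RMSE = {rmse:.1f} ton/ha ({rel_rmse:.1f}%)")
```

The point of the sketch is the data flow: field measurements enter only in phase I and at validation, and the satellite model never sees field AGB directly.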

The accuracy of the predictions was evaluated by means of root mean square error (RMSE) and bias. To our knowledge, there are few studies focused on using ALS data as a simulated ground-truth for the satellite-based forest inventory (e.g., [

The study area is located in eastern Finland (62°31′N, 30°10′E) (

The field measurements were carried out in May to June 2010. Altogether, 78 field plots were placed subjectively into different stands, based on the development class and dominant tree species, in an attempt to record the species and size variation over the area. The sizes of the field plots ranged from 20 × 20 to 30 × 30 m. Field sample plots were divided into training (n = 50) (phase I) and validation (n = 28) data sets; the training set comprised the plots whose size was 25 × 25 m (0.0625 ha). The other plots with varying sizes were used for validation. It is worth noting that the larger sample plots maintain a greater amount of spatial overlap, minimize the edge effects, and increase the sample variances [

A total of 200 surrogate plots (35 × 35 m) were placed to cover the study area using an ortho-rectified aerial photograph with the green, red and near-infrared portions of the spectrum. These plots were used as simulated ground truth when training the RapidEye models in phase II (described in Section 3.4). We used visual interpretation of aerial photographs in the placement of the surrogate plots. The idea was that the surrogate plots should cover all the variation (

All trees with either diameter at breast height (DBH) ≥ 4 cm or height ≥ 4 m were measured in the field. The volumes of the individual trees were calculated as a function of DBH and tree height using species-specific models [_{ki}_{0}, b_{1} and b_{2} are the vectors of fixed effects parameters, d_{ski} is (2 + 1.25 × DBH), h_{ki} is the height (m) of the tree, u_{k} is the variance of random parameters, and e_{ki} is the residual error.

The ALS data were collected on 18 July 2009 using an Optech ALTM Gemini laser scanning system. The nominal pulse density was about 0.65 pulses per square meter. The test site was scanned from an altitude of approximately 2000 m above ground level with a field of view of 30 degrees and a side overlap of 20% between transects. The pulse repetition frequency was set to 50 kHz. The swath width was approximately 1,050 m.
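As a rough consistency check on the flight parameters above, the swath width over flat terrain follows from the flying altitude and the full scan angle. The sketch below assumes flat ground and a nadir-centred scan, which the text does not state; the function names are illustrative.

```python
import math

def swath_width(altitude_m, fov_deg):
    """Swath width over flat terrain for a nadir-centred scan with full angle fov_deg."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)

def flight_line_spacing(width_m, side_overlap):
    """Distance between adjacent flight lines for a given fractional side overlap."""
    return width_m * (1.0 - side_overlap)

width = swath_width(2000.0, 30.0)
spacing = flight_line_spacing(width, 0.20)
print(f"swath width = {width:.0f} m, line spacing = {spacing:.0f} m")
```

With the nominal values (2000 m, 30 degrees) this gives roughly 1,070 m, in line with the reported swath of approximately 1,050 m given that the altitude is itself approximate.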

The RapidEye satellite images were collected on 19 May 2012 for the test area. The RapidEye image index numbers were 2012-05-19T102327_RE5_3C-N05_9429301_137127 and 2012-05-19T102330_RE5_3C-N05_9429336_137127. RapidEye imagery provides multispectral optical imagery of five bands (blue 440–510 nm, green 520–590 nm, red 630–685 nm, red-edge 690–730 nm, and near infrared 760–850 nm). A total of two RapidEye images were collected with spatial resolution of five meters to cover the study area. All the RapidEye images were radiometrically and geometrically corrected (overall standard error was 0.53 m) according to the standard of RapidEye [

First points were classified as ground and non-ground hits according to the approach described by Axelsson [

The area-based method was used to model the relationships between field-measured variables (e.g., AGB) and canopy height/density metrics from the ALS data [
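To make the area-based predictors concrete, the sketch below computes a few plot-level metrics from a list of echo heights, following the verbal definitions in the list of ALS explanatory predictors (height-sum percentiles, mean vegetation height above 5 m, and the ratio of echoes below 5 m). The function names and the sample heights are hypothetical, and the authors' actual software is not known from the text.

```python
import numpy as np

def height_sum_percentile(heights, p):
    """Height at which the cumulative sum of the sorted echo heights
    is closest to p% of the total height sum."""
    h = np.sort(np.asarray(heights, dtype=float))
    cum = np.cumsum(h)
    target = p / 100.0 * cum[-1]
    return float(h[np.argmin(np.abs(cum - target))])

def low_vegetation_ratio(heights, threshold=5.0):
    """Share of echoes below the vegetation threshold (m)."""
    h = np.asarray(heights, dtype=float)
    return float(np.mean(h < threshold))

def mean_vegetation_height(heights, threshold=5.0):
    """Mean height of echoes above the vegetation threshold (m)."""
    h = np.asarray(heights, dtype=float)
    veg = h[h > threshold]
    return float(veg.mean()) if veg.size else 0.0

# Hypothetical first-and-single echo heights (m) for one plot
echo_heights = [1.0, 2.0, 3.0, 4.0, 10.0]
metrics = {
    "h_p50": height_sum_percentile(echo_heights, 50),
    "h_p90": height_sum_percentile(echo_heights, 90),
    "h_mean": mean_vegetation_height(echo_heights),
    "d_low": low_vegetation_ratio(echo_heights),
}
print(metrics)
```

Note that a height-sum percentile differs from an ordinary quantile: tall echoes contribute more to the cumulative sum, so the resulting heights sit higher than count-based percentiles.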

Before using the RapidEye images in the final calculation, the necessity of radiometric correction was examined according to the Ridge method [

The spectral and textural features were calculated from the RapidEye images and used as predictors for modelling. The extraction was done based on the field plot size and RapidEye image pixel size so that the extracted image value could properly represent the forest attributes (e.g., AGB) at the plot level. The spectral predictors were derived from each band of RapidEye image by taking the mean DN values based on the plot size. Three spectral vegetation indexes were computed from the RapidEye images. Normalized Difference Vegetation Index (NDVI) (

NDVI thus varies between −1.0 and +1.0,

Also, the red-edge and green band ratio in the RapidEye images has a good response to forest biomass. Thus, the second NDVI [

We ran the Pearson correlation test to confirm which NDVI has the better correlation with AGB and volume. The Pearson correlation showed that the first NDVI and third NDVI had better correlations (r = −0.32 and −0.34, respectively) compared to the second NDVI (r = −0.12). A total of 14 textural features [
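The index calculation and the correlation check described above can be illustrated as follows. The standard NDVI uses the near-infrared and red bands; the helper accepts any two bands, so the same function would serve for other band pairs. All plot values below are hypothetical and are chosen only to show the mechanics, not to reproduce the reported correlations.

```python
import numpy as np

def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), with 0 where the denominator is zero."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

# Hypothetical plot-level mean DN values and AGB (ton/ha)
nir = np.array([120.0, 110.0, 100.0, 95.0, 90.0])
red = np.array([30.0, 35.0, 40.0, 45.0, 50.0])
agb = np.array([60.0, 90.0, 120.0, 150.0, 180.0])

ndvi = normalized_difference(nir, red)
r = np.corrcoef(ndvi, agb)[0, 1]   # Pearson correlation coefficient
print(np.round(ndvi, 3), round(r, 2))
```

For non-negative band values the index is bounded by −1 and +1, which is why it is convenient for comparing plots with different overall brightness.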

The Haralick textural features were derived from the RapidEye images by considering the mean DNs based on the plot size. The co-occurrence matrix was used as the input for computing the Haralick features. We used single spectral bands (mean), three vegetation indices (e.g., normalized difference vegetation index) and Haralick textural features from RapidEye satellite data. Finally, a total of 36 RapidEye explanatory predictors (
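How texture features follow from a co-occurrence matrix can be shown with a small NumPy sketch. It builds a symmetric, normalized grey-level co-occurrence matrix for a single pixel offset and computes four of the fourteen Haralick features. The quantization to a small number of grey levels, the offset of one pixel to the right, and the function names are assumptions for illustration; the text does not specify the settings used in the study.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset.
    `image` must hold integer grey levels in [0, levels) and contain
    at least one pixel pair at the given offset."""
    img = np.asarray(image, dtype=int)
    rows, cols = img.shape
    m = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[img[r, c], img[r2, c2]] += 1.0
    m = m + m.T                      # count each pair in both directions
    return m / m.sum()

def haralick_subset(p):
    """Four Haralick features from a normalized co-occurrence matrix p."""
    i, j = np.indices(p.shape)
    nz = p[p > 0]
    return {
        "angular_second_moment": float(np.sum(p ** 2)),
        "contrast": float(np.sum((i - j) ** 2 * p)),
        "inverse_difference_moment": float(np.sum(p / (1.0 + (i - j) ** 2))),
        "entropy": float(-np.sum(nz * np.log(nz))),
    }

flat = haralick_subset(glcm([[0, 0], [0, 0]], levels=2))
striped = haralick_subset(glcm([[0, 1], [0, 1]], levels=2))
print(flat)
print(striped)
```

A uniform window gives zero contrast and entropy, while the striped window gives higher contrast, which is the kind of tonal variation the texture predictors are meant to capture.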

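Ordinary least squares (OLS) is the most commonly employed method for estimating the unknown parameters of a linear regression model. The OLS model can be written as

$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i$$

where y_{i} is the response (AGB or volume) at plot i, x_{i1}, ..., x_{ip} are the predictors, β_{0}, ..., β_{p} are the parameters to be estimated, and ε_{i} is the residual error.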
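For reference, the OLS estimate has a closed form. The notation below is a standard textbook convention and an assumption here, not taken from the source: $\mathbf{y}$ is the vector of $n$ plot-level responses, $X$ is the $n \times (p+1)$ design matrix whose first column is ones, and $\mathbf{x}_i^{\top}$ is its $i$-th row.

```latex
\hat{\boldsymbol{\beta}}
  = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \left( y_i - \mathbf{x}_i^{\top} \boldsymbol{\beta} \right)^2
  = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{y}
```

The second equality holds when $X^{\top} X$ is invertible, which requires that the predictors are not perfectly collinear and that there are more plots than parameters.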

As we had a large number of predictors, we used the ^{2} (coefficient of determination) value. Since the algorithm returns the best model of each size (number of predictors), the results do not depend on a penalty model for model size. In addition to stepwise variable selection, we used Pearson correlation and the maximum R^{2} improvement variable selection techniques to select ALS/RapidEye-derived variables to be included in the models. During the variable selection, no explanatory variable was left in the models with a partial ^{2} improvement techniques to search for the “best” one-variable model, the “best” two-variable model, and so forth, although it is not guaranteed that it finds the model with the highest R^{2} for each size. We repeated the selection 10 times by randomly selecting 50% of the observations for each repetition to obtain stable final regression models. Our final models contained the most frequently occurring ALS/RapidEye-derived variables. Næsset
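The repeated selection on random halves of the data can be sketched as below. This is only an illustration of the idea: it swaps in an exhaustive search over all subsets of a fixed size, ranked by R^{2}, which is not necessarily the algorithm the authors ran, and the synthetic predictors and function names are hypothetical.

```python
import itertools
from collections import Counter

import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def best_subset(X, y, size):
    """Column indices of the subset of the given size with the highest R^2."""
    combos = itertools.combinations(range(X.shape[1]), size)
    return max(combos, key=lambda cols: r_squared(X[:, list(cols)], y))

def stable_variables(X, y, size, repeats=10, frac=0.5, seed=0):
    """Repeat the selection on random subsamples; count how often each column is chosen."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    n = len(y)
    for _ in range(repeats):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        counts.update(best_subset(X[idx], y[idx], size))
    return counts.most_common()

# Synthetic example: only columns 1 and 3 carry signal
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(0.0, 0.1, 60)

ranking = stable_variables(X, y, size=2)
print(ranking)   # (column index, times selected), most frequent first
```

Keeping the variables that recur across repetitions guards against predictors that look good only on one particular draw of plots, which matters when the number of training plots is small relative to the number of candidate predictors.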

The RMSE, relative RMSE and bias were used to assess the accuracy of the predictions, and the coefficients of determination (R^{2} and adjusted R^{2}) describe the model fit. A higher adjusted R^{2} indicates a better model. The equation of adjusted R^{2} is as follows (
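The accuracy statistics named above can be computed as in the following sketch. The formulas are the standard ones: RMSE is the square root of the mean squared residual, relative values are expressed as a percentage of the observed mean, and adjusted R^{2} = 1 − (1 − R^{2})(n − 1)/(n − p − 1) for n plots and p predictors. The sign of the bias (predicted minus observed) is an assumption; it was chosen because the negative biases in the validation table coincide with predicted means below the observed means. The input values are hypothetical.

```python
import numpy as np

def accuracy_stats(observed, predicted, n_predictors):
    """RMSE, bias, their relative versions (%), R^2 and adjusted R^2."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    n = y.size
    diff = y_hat - y                          # assumed sign: predicted - observed
    rmse = np.sqrt(np.mean(diff ** 2))
    bias = np.mean(diff)
    mean_obs = y.mean()
    r2 = 1.0 - np.sum(diff ** 2) / np.sum((y - mean_obs) ** 2)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {
        "rmse": float(rmse),
        "rmse_pct": float(100.0 * rmse / mean_obs),
        "bias": float(bias),
        "bias_pct": float(100.0 * bias / mean_obs),
        "r2": float(r2),
        "adj_r2": float(adj_r2),
    }

# Hypothetical observed and predicted AGB (ton/ha) for four plots
stats = accuracy_stats([100, 120, 140, 160], [110, 110, 150, 150], n_predictors=1)
print(stats)
```

Adjusted R^{2} penalizes each added predictor, so it is the more suitable of the two fit measures when models with different numbers of variables are compared.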

The precisions of the ALS models are detailed in

^{3}/ha (12%) for the models based on 10 training plots. In contrast, the highest relative RMSE of AGB (20%) was observed for the models based on 25 training plots. In addition, the R^{2} of the AGB model was 0.83 based on 10 training plots. A slightly better R^{2} value (0.85) was observed for volume. However, the models based on 25 training plots had the lowest R^{2} values for both AGB (0.65) and volume (0.61). Statistical outliers were not frequent in the residual plots (^{2} value dropped from 0.83 to 0.50. Similarly, when we added the ratio of the number of last and single echoes below 5 m (low vegetation) to the total number of last and single echoes to the AGB model, the R^{2} value dropped to 0.66. Moreover, the Pearson correlations between the density metric used and AGB, volume, tree height, and basal area were 0.84, 0.85, 0.82, and 0.75, respectively, which confirms that this density metric explained the structure of the study area very well. In contrast, the correlations between the ALS height percentiles of 60%, 70%, 80%, 90% and AGB were 0.60, 0.61, 0.62, and 0.64, respectively. In the case of volume, the correlations for the ALS height percentiles were higher (0.68, 0.68, 0.69 and 0.70, respectively), which indicates that the ALS height percentiles correlate better with volume than with AGB. The key to the density metric used must be the 13.5 m height threshold, which is much higher than what is typically used and seems to be close to the smallest mean height at the plots. We think that a combination of small variation in the plot data and the unusually high threshold makes this variable special for this particular data set. It might also be possible to achieve these results in a managed commercial forest, where an increase in height would also result in an increase in density, so one density variable would be enough.

The error statistics of the linear models are given in

We calibrated the RapidEye models against the ALS-trained set of surrogate plots using leave-one-out validation (LOOV). The mean and standard deviation of AGB for the ALS-trained set of surrogate plots were 103 tons/ha and 34 tons/ha, respectively. The accuracy was similar for all numbers of training plots (10, 25, and 50). However, the RapidEye prediction based on the 10 training plots gave the lowest RMSE against the ALS estimates, 19 tons/ha (19%) for AGB. In addition, volume had a similar RMSE of 38–41 m^{3}/ha (20%–21%) for all numbers of training plots (
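Reading LOOV as leave-one-out validation, the procedure can be sketched as follows: each plot is predicted from a model fitted to all the other plots, and the error statistics are computed from those held-out predictions. The data and function names are hypothetical; only the procedure itself is illustrated.

```python
import numpy as np

def loo_predictions(X, y):
    """Predict each observation from an OLS model fitted without it."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ beta
    return preds

def in_sample_predictions(X, y):
    """Ordinary fitted values, for comparison with the held-out predictions."""
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return A @ beta

# Hypothetical spectral feature and ALS-predicted AGB for 40 surrogate plots
rng = np.random.default_rng(2)
feature = rng.uniform(0.2, 0.8, 40)
agb_als = 180.0 - 150.0 * feature + rng.normal(0.0, 15.0, 40)

rmse_loo = np.sqrt(np.mean((agb_als - loo_predictions(feature, agb_als)) ** 2))
rmse_fit = np.sqrt(np.mean((agb_als - in_sample_predictions(feature, agb_als)) ** 2))
print(f"LOO RMSE = {rmse_loo:.1f}, in-sample RMSE = {rmse_fit:.1f}")
```

For OLS the leave-one-out RMSE is never smaller than the in-sample RMSE, so it gives a more conservative picture of how the model would perform on plots it has not seen.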

The accuracies at the validation plots are shown in ^{3}/ha (19%–20%). We found that these accuracies at the independent validation plots can be considered acceptable and are even better than those reported in studies of conventional compartment-based field inventory in Finland [^{2} were 21% and 44%, respectively, for AGB.

This paper presented and tested forest AGB and volume estimation using ALS and RapidEye sensor data in a mixed-species boreal forest of eastern Finland. The outcomes of the study were encouraging, but they need further validation in other forest types along with development of the methodology. The overall strength of the ALS–RapidEye fusion revealed here is its promising accuracy for biomass accounting in coniferous forest ecosystems. The linear regression analysis showed that the ALS data had good prediction accuracy at phase I: the ALS models explained 83% of the variability in AGB (R^{2} = 0.83). Furthermore, the RapidEye models provided relatively good accuracy at phase II for both AGB and volume. The RapidEye models had a relative RMSE of 20% at the independent validation plots. Such accuracy indicates that the combination of ALS and RapidEye would be a promising fusion for the estimation of boreal forest attributes. Nevertheless, more testing and validation should be done in different forest landscapes and at larger scales.

We used simple linear regression to predict biomass and volume; it is one of the most commonly used methods in remote sensing-based regression modeling (e.g., [

Though we used regression subset selection techniques (

The presented forest biomass inventory method has both advantages and constraints. It involves a sample of ALS data at phase I and requires only a few sample plots for model calibration (fitting). In addition, the surrogate plots (phase II) could be placed systematically over the whole area based on vegetation types, geographic location, climatic conditions and tree species composition. Therefore, our calibrated models at phase II represent the whole study area and are less likely to miss vegetation types and tree species compositions. The surrogate plots should be placed so that they reflect the full range of variation in biomass over the study area, including the rare forest types. Systematic selection or selection based on predicted AGB would be a natural choice. The locations of the surrogate plots could also be selected through weighted random sampling or stratified sampling. Traditional forest inventory depends on a large set of ground-truth data for model calibration. However, this study showed a promising outcome in which a sample of ALS data was employed as ground truth for model calibration with the satellite image for mapping the whole area of interest.

Another issue is that the satellite image acquisition dates may be different, introducing seasonal effects. Therefore, it would require the DN (digital number) values to be comparable between datasets which in turn would require the use of absolute reflectance instead of relative DN values. It would be recommended to use the same season for image acquisition. The result from

Our study showed that the ALS-based forest inventory produced very accurate estimates (phase I). ALS data predict forest biomass with high accuracy when ALS height/density metrics are regressed against data from field-measured plots [

Reducing the number of training plots is of great importance to minimize field inventory cost. Our findings show that a reduced number of sample plots provides accuracy similar to that of the full dataset. When we reduced the number of sample plots from 50 to 10, there was no significant reduction in the prediction accuracy of AGB and volume (

The ALS models were calibrated against the field sample plots in phase I of the model building. The R^{2} value of our predicted forest AGB is close to those of other studies, which ranged between 0.74 and 0.88 in the studies by Hall

The RMSE at the independent validation plots was lower than in most previous studies of boreal coniferous and mixed forest. Our study had a relative RMSE of 19%–20% for volume, whereas previous studies using the optical sensor had a relative RMSE of 42%–82% for volume, for instance, studies by Tokola and Heikkilä [

The relatively small errors in this study are probably due to the type of biomass being measured and the relatively good spatial and spectral resolution of the RapidEye data with high geometric accuracy. Hyyppä

We examined AGB and volume estimation employing ALS and RapidEye data in a two-phase sampling method. Linear regression was employed to predict forest characteristics in a managed boreal forest in eastern Finland. The present study has confirmed that the accuracy of the AGB and volume estimates is comparable to that of the forest inventory by compartments (or stands) in Finland. The approach presented here is a promising alternative for use in forest management in Finland, with sufficient accuracy for the purpose of forest resource inventory. In addition, calibrating the RapidEye data using the ALS-simulated surrogate plots and different models at phases I and II makes this approach promising and stable for biomass and carbon accounting in the boreal forest. It could also offer a valuable methodology for the inventories needed in the REDD program. As the tested area is rather small, further validation is needed in a larger study area to better justify the presented approach. Finally, future studies should be carried out using a combination of ALS and other optical sensors to confirm the validity of this approach in other forest landscapes. This approach could also be tested by tracking changes in forest biomass at the regional and national levels.

This research work was conducted at the University of Eastern Finland. It was funded by the Finnish Cultural Foundation and Centre for International Mobility (CIMO). The authors are grateful to these institutions for resources and funding. Last but not least, the comments provided by the three anonymous reviewers are gratefully acknowledged.

The authors declare no conflict of interest.

Training, surrogate and validation plots location and administrative map of Finland (left side), and RapidEye image with study area marked (right side).

Feature space images of each RapidEye band ((

Digital numbers in the overlapping area of the RapidEye blue band.

Flowchart of the two-phase sampling design.

Residual plots of ALS-predicted AGB at phase I.

Residual plots of RapidEye-predicted AGB at phase II.

The residual plots of RapidEye for AGB at validation plots.

Estimates of multivariate model fixed parameters, and variances of random stand parameters (u_{k}) and residual errors (e_{ki}) [

b_{0} | −3.1 | −1.8 | −3.6 |
b_{1} | 9.5 | 9.4 | 10.5 |
b_{2} | 3.2 | 0.4 | 3.0 |
u_{k} | 0.009 | 0.006 | 0.00068 |
e_{ki} | 0.010 | 0.013 | 0.000727 |

The mean characteristics of the training plots and validation plots.

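Variable | Min | Max | Mean | SD |
---|---|---|---|---|
Training plots (n = 50) | | | | |
Total volume (m^{3}/ha) | 96.1 | 433.8 | 209.3 | 74.9 |
AGB (ton/ha) | 51.5 | 226.6 | 113.0 | 39.7 |
Validation plots (n = 28) | | | | |
Total volume (m^{3}/ha) | 103.6 | 382.5 | 219.1 | 69.0 |
AGB (ton/ha) | 55.9 | 182.2 | 115.7 | 31.5 |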

List of airborne laser scanning (ALS) explanatory predictors.

No. | Predictor | Description |
---|---|---|
1...10 | H_{fp10–100} | Height for which the cumulative sum of ordered first and single echo heights is closest to 10%, 20%, 30%...100% of the total height sum. |
11...20 | H_{lp10–100} | Height for which the cumulative sum of ordered last and single echo heights is closest to 10%, 20%, 30%...100% of the total height sum. |
21...23 | I_{30–90} | Intensity for which the cumulative sum of ordered first and single echo intensities is closest to 30%, 60% and 90% of the total intensity sum. |
24...26 | I_{30–90} | Intensity for which the cumulative sum of ordered last and single echo intensities is closest to 30%, 60% and 90% of the total intensity sum. |
27 | H_{mean} | Mean height of first and single echo vegetation points (points over high vegetation threshold 5 m). |
28 | H_{std} | Standard deviation of first and single echo heights. |
29 | D | Ratio of the number of first and single echoes below 5 m (low vegetation) and the total number of first and single echoes. |
30 | D | Ratio of the number of last and single echoes below 5 m (low vegetation) and the total number of last and single echoes. |
31...38 | D_{hlp0–7} | Ratio of last and single echoes with height lower than 1.5 m + |
39...41 | D_{fp10,30,50} | Ratio of first and single echoes with intensity I ≤ 0.5+ |
42...44 | D_{lp10,30,50} | Ratio of last and single echoes with intensity I ≤ 0.5+i for |
45 | D_{flog} | Logarithm of the ratio of the number of first and single echoes below 5 m (low vegetation) and the total number of first and single echoes. |
46 | H_{f3mean} | Mean of the largest three heights within first and single echoes. |

List of RapidEye explanatory predictors.

B1 | Blue (mean) |

B2 | Green (mean) |

B3 | Red (mean) |

B4 | Red-edge (mean) |

B5 | NIR (mean) |

NDVI_{1} | First NDVI (mean) |

NDVI_{2} | Second NDVI (mean) |

NDVI_{3} | Third NDVI (mean) |

HR1 | Angular second moment |

HR2 | Contrast |

HR3 | Correlation |

HR4 | Sum of squares |

HR5 | Inverse difference moment |

HR6 | Sum average |

HR7 | Sum variance |

HR8 | Sum entropy |

HR9 | Entropy |

HR10 | Difference variance |

HR11 | Difference entropy |

HR12 | Information measures of correlation |

HR13 | Information measures of correlation |

HR14 | Maximum correlation coefficient |

Accuracy of the ALS prediction against field estimation.

Training plots | Attribute | Mean | SD | RMSE | RMSE (%) | Bias | Bias (%) | R^{2} Adj. (%) |
---|---|---|---|---|---|---|---|---|
50 | AGB, ton/ha | 112.9 | 34.8 | 18.7 | 16.6 | 0.0 | 0.0 | 76.6 |
50 | Volume, m^{3}/ha | 209.2 | 66.2 | 34.5 | 16.5 | 0.0 | 0.0 | 77.7 |
25 | AGB, ton/ha | 111.5 | 31.9 | 22.0 | 19.7 | 0.0 | 0.0 | 65.3 |
25 | Volume, m^{3}/ha | 204.1 | 56.2 | 41.7 | 20.4 | 0.0 | 0.0 | 61.9 |
10 | AGB, ton/ha | 106.6 | 32.9 | 13.0 | 12.2 | 0.0 | 0.0 | 83.2 |
10 | Volume, m^{3}/ha | 198.8 | 67.7 | 25.1 | 12.6 | 0.0 | 0.0 | 85.0 |

AGB: above-ground biomass. R^{2} Adj: adjusted R^{2}.

Accuracy of RapidEye prediction against ALS estimation at phase II.

Training plots | Surrogate plots | Attribute | Mean | SD | RMSE | RMSE (%) | Bias | Bias (%) | R^{2} Adj. (%) |
---|---|---|---|---|---|---|---|---|---|
50 | 200 | AGB, ton/ha | 102.7 | 26.2 | 21.0 | 20.4 | 0.0 | 0.0 | 59.5 |
50 | 200 | Volume, m^{3}/ha | 189.7 | 49.3 | 40.6 | 21.3 | 0.0 | 0.0 | 58.4 |
25 | 200 | AGB, ton/ha | 101.4 | 26.8 | 21.5 | 21.2 | 0.0 | 0.0 | 59.5 |
25 | 200 | Volume, m^{3}/ha | 186.3 | 46.8 | 38.5 | 20.6 | 0.0 | 0.0 | 58.4 |
10 | 200 | AGB, ton/ha | 100.8 | 24.5 | 19.6 | 19.5 | 0.0 | 0.0 | 59.5 |
10 | 200 | Volume, m^{3}/ha | 186.8 | 49.9 | 41.0 | 21.9 | 0.0 | 0.0 | 58.4 |

AGB: above-ground biomass. Training plots: the number of training plots from phase I used in the estimation of the 200 ALS surrogate plots for phase II. R^{2} Adj: adjusted R^{2}.

Accuracy of AGB and volume obtained from RapidEye at validation plots.

Training plots | Validation plots | Attribute | Mean | SD | RMSE | RMSE (%) | Bias | Bias (%) | R^{2} Adj. (%) |
---|---|---|---|---|---|---|---|---|---|
50 | 28 | AGB, ton/ha | 112.5 | 32.5 | 23.6 | 20.4 | −3.1 | −2.7 | 50.7 |
50 | 28 | Volume, m^{3}/ha | 207.5 | 59.4 | 43.2 | 19.7 | −11.4 | −5.2 | 54.0 |
25 | 28 | AGB, ton/ha | 111.5 | 33.3 | 24.1 | 20.8 | −4.1 | −3.5 | 50.7 |
25 | 28 | Volume, m^{3}/ha | 203.2 | 56.4 | 44.3 | 20.2 | −15.7 | −7.1 | 54.0 |
10 | 28 | AGB, ton/ha | 110.0 | 30.4 | 23.3 | 20.2 | −5.6 | −4.8 | 50.7 |
10 | 28 | Volume, m^{3}/ha | 204.9 | 60.1 | 44.3 | 20.1 | −14.1 | −6.4 | 54.0 |

AGB: above-ground biomass. Training plots: the number of training plots used in the estimation of the 200 ALS surrogate plots at phase II. R^{2} Adj: adjusted R^{2}.