Considerations towards a Novel Approach for Integrating Angle-Count Sampling Data in Remote Sensing Based Forest Inventories

Integration of remote sensing (RS) data in forest inventories for enhancing plot-based forest variable prediction is a widely researched topic. Geometric consistency between forest inventory plots and areas for extraction of RS-based predictive metrics is considered a crucial factor for accurate modelling of forest variables. Achieving geometric consistency is particularly difficult with regard to angle-count sampling (ACS) plots, which have neither distinct shape nor distinct extent. This initial study considers a new approach for integrating ACS and RS data, where the concept of ACS is transferred to RS-based metrics extraction. By using the relationship between tree height and diameter at breast height (DBH), pixels of a RS-based canopy height model are extracted if their value suggests a DBH that would lead to inclusion in an angle-count sample at the given distance to the plot centre. Different variations of this approach are tested by modelling timber volume in national forest inventory plots in Germany. The results are compared to those achieved using fixed-radius plots. A root mean square error of approximately 42% is achieved by both the new and fixed-radius approaches. Therefore, the new approach is not yet considered sufficient for overcoming all difficulties concerning the integration of ACS plot and RS data. However, possibilities for improvement are discussed and will be the subject of further research.


Introduction
The utilisation of remote sensing (RS) data, especially aerial images, for gaining information about forests has a long tradition [1], but for a long time, information extraction was laborious and expensive. Improved availability of digital image data and enhanced computational power, as well as advances in automatic image data processing, created new opportunities for fast and extensive information extraction. Dense image matching allows automatic generation of high-resolution digital surface models (DSM) from aerial stereo images, which enable the extraction of 3D data for forest stands. Another means of extracting this kind of information is airborne laser scanning (ALS), which directly measures 3D points for DSM generation. However, ALS data acquisition is more expensive than acquisition of aerial imagery, which reduces the availability and timeliness of ALS data compared to image data. Height information extracted from RS data sources was found to be well correlated with heights measured in the field (e.g., [2]). This relationship was further used to model and predict many other forest variables, among them timber volume and biomass (e.g., [3]). In forest inventories, these data can be used to improve the precision of forest variable estimates by combining very high resolution RS data with forest inventory plot data in a modelling process.
Deo et al. [4] noted that in plot-based approaches, accurate statistical modelling for forest inventories requires geometric consistency between the terrestrial inventory data and RS data. That is, location, size, and shape of plots for inventory data collection have to correspond to the geometries used for RS data extraction. The size and shape of a sample plot, and the distribution of sampled and non-sampled trees within this plot, affect the fit between plot-based forest variables and metrics derived from remotely sensed data. Therefore, the configuration of sample plots (that is, the method for selecting sample trees) is a factor that should be considered when combining inventory plot data with RS data, as it greatly influences the predictive power of RS data metrics.
Fixed-area plots are considered "ideal when working with remotely sensed data" [4], as geometric inconsistencies tend to be minimal. This is very different for angle-count sampling (ACS), which is a well-established method for forest inventory data acquisition that does not utilise fixed-area plots. The ACS method was developed by Bitterlich in 1948 [5]. Sample trees are selected with a probability proportional to their basal area, which is usually defined through the diameter at breast height (DBH), with a predefined basal area factor (BAF) used as weight [4-6]. Therefore, some trees close to the sampling point might not be sampled, while other trees further away will be sampled, resulting in sample plots of unknown and varying shape and size. In fact, ACS has no actual plot size [7] and the trees sampled might be different from the trees sampled using a fixed-area plot with the same plot centre [8]. Compared to fixed-area plots, this leads to an increased geometric inconsistency between inventory and RS data, impeding accurate statistical modelling in plot-based approaches [4].
In order to allow integration of ACS and RS data, the problem of geometric inconsistency needs to be attenuated. Finding the representative extent of ACS plots has been a topic of research for a long time. Matérn [7] aimed to find a plot size that allows ACS to be compared to fixed-area sampling with the same number and distribution of samples. He found that the plot area equals the ratio of the mean basal area to the BAF, requiring some knowledge about diameter distribution.
More recently, the motivation behind finding a suitable ACS plot size is to enable the utilisation of ACS data together with RS data. So far in the literature, mostly ALS data was used with this prospect, but there are also instances of utilising optical image data. Maack et al. [9] examined the variation of ACS radii and chose a fixed radius of 20 m as a compromise between average ACS radius and sensor (ALS and Landsat 7) resolution. Hollaus et al. [10] determined an appropriate fixed-radius sample plot size by testing five different radii, choosing the one leading to the highest accuracy in a predictive timber volume model. A similar approach was chosen by [11], who selected the best suited plot size for timber volume determination by analysing the coefficient of determination and the standard deviation. Scrinzi et al. [12] used ACS data that were simulated using fixed-radius sampling data. They tested three approaches for approximating "variable pseudo areas" of ACS that are suitable for extraction of ALS data: using the fixed-radius plot area and two variations of the plot-size formula in [7]. They found that ACS was marginally better for model calibration than fixed-radius sampling. Deo et al. [4] optimised the plot area for ALS data extraction for timber volume modelling depending on different BAFs. They noted that the optimum plot area for ACS inventories is dependent on forest structure and the chosen BAF.
Maltamo et al. [6] used truncated ACS plots with a BAF of 2 and a maximal plot radius of 9 m. Thus, trees of smaller diameters (e.g., DBH ≤ 25.5 cm) are sampled using the normal ACS approach. Larger trees (e.g., DBH > 25.5 cm), however, are sampled in a way comparable to fixed-radius sampling, because the maximal plot radius is the cut-off distance for sampling. Thus, for larger trees, the angle-count plot equals a fixed-radius plot [6]. They modelled several plot-based variables (timber volume, basal area, number of stems, basal area median diameter, and basal area median height) and achieved root mean square errors (RMSEs) comparable to modelling using the fixed-radius plot data. They noted that ACS without truncation would lead to less accurate results and complicate the selection of a plot size for ALS data extraction. Immitzer et al. [13] used multiple concentric circles with radii from 5 to 20 m for extracting metrics from optical stereo satellite images (WorldView-2). The combined metrics from all circles were joined with ACS data in a Random Forest model for wall-to-wall mapping of growing stock. They also produced results using several specific circle radii separately and found that using multiple radii combined led to lower RMSEs. This is explained by single circles not entirely meeting the design of ACS plots, with sample trees always being excluded or non-sample trees being included [13].
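The 25.5 cm diameter quoted for the truncated plots of [6] follows from standard ACS geometry: a tree of a given DBH is counted only within a limiting distance that depends on the BAF. The sketch below illustrates this relation. The function names are hypothetical and chosen for illustration only; the formula is the textbook limiting-distance relation (DBH in cm, distance in m, BAF in m²/ha), not code taken from any of the cited studies.

```python
import math


def limiting_distance_m(dbh_cm: float, baf: float) -> float:
    """Largest distance (m) at which a tree of the given DBH (cm) is still counted."""
    # Standard ACS geometry: R = DBH_cm / (2 * sqrt(BAF)).
    return dbh_cm / (2.0 * math.sqrt(baf))


def limiting_dbh_cm(dist_m: float, baf: float) -> float:
    """Smallest DBH (cm) a tree needs in order to be counted at dist_m metres."""
    # The same relation solved for DBH.
    return 2.0 * math.sqrt(baf) * dist_m


# Truncated plots of [6]: BAF 2, maximum radius 9 m.
print(round(limiting_dbh_cm(9.0, 2.0), 1))  # 25.5
```

Trees thicker than this value would be countable beyond 9 m, so the truncation radius, not the angle count, decides whether they are sampled.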
This study considers a novel approach for integration of ACS plot data and RS data for forest variable modelling. This approach aims to transfer the concept of ACS to the extraction of RS-based descriptive metrics in order to better meet the design of ACS plots. In this study, only 3D data (height) is extracted for modelling.

Study Site and Reference Data
The study site is located in the south-west of Germany in the federal state of Baden-Württemberg (Figure 1), covering about 120,000 ha of the southern parts of the Black Forest and the Upper Rhine Plain. This results in very diverse topographic conditions, with flat areas in the Rhine plain, and hills with partly steep slopes in the Black Forest mountain range. Altitudes range between 205 m and 1220 m above sea level. Hence, the forested areas within the study site (which cover approximately 64,150 ha) vary greatly in terms of species composition. The lower regions in the Rhine plain are dominated by broadleaves (mainly Fagus sylvatica (26%), Quercus spec. (8%), and Fraxinus excelsior (3%)), while the upper regions are dominated by conifers (mainly Picea abies (30%), Abies alba (19%), and Pseudotsuga menziesii (9%)). However, many stands are a mixture of both conifers and broadleaves, to a varying degree. The age class composition of stands varies to a similar degree.
The study site encompasses 552 woodland plots of the latest national forest inventory (NFI) in Germany, which was conducted in 2011-2012. The sampling design of the NFI in Germany utilises a 4 km × 4 km grid aligned with the Gauss-Krüger coordinate system. In the federal state of Baden-Württemberg, however, the density of the grid is increased to 2 km × 2 km. Each grid point defines the centre of one square inventory tract of 150 m × 150 m. The corner points of the tracts define the centres of the NFI sampling plots. The plots were geolocated by the field crews using the MxBox Global Navigation Satellite System (GNSS) device of GEOSat GmbH [14]. On each plot, sample trees are selected using ACS with a BAF of 4 and a minimum DBH of 7 cm. For each sample tree, azimuth and distance to the plot centre, as well as DBH and species, are recorded, amongst other variables. Additionally, heights are measured for a subset of sample trees. The heights of the remaining sample trees are subsequently modelled.

Remote Sensing Data
This study utilised digital aerial stereo images acquired during the regular aerial surveys of the Baden-Württemberg land surveying authority (LGL). The aerial imagery is characterised by a forward overlap of 60%, a side overlap of 30%, and a nominal ground resolution of 20 cm. The images were acquired by a large-size digital aerial matrix camera (UltraCam Eagle) in one survey flight in June 2013 with four spectral bands (red (R), green (G), blue (B), and near infrared (NIR)) and a focal length of 79.8 mm. According to LGL, the orientation accuracy after aerotriangulation is 0.04 m in planimetric coordinates (x and y) and 0.14 m in height (z) on control points. The data delivery from LGL included the imagery and an Inpho project file that comprises the exterior and interior orientation, ready to be used for subsequent photogrammetric processing. In addition, a high quality digital terrain model (DTM) with 1 m resolution was provided by LGL. The DTM was derived from ALS data collected between 2001 and 2004, with an approximate point density of 0.8 points per square metre. The nominal height accuracy of this ALS-DTM reported by LGL is 0.5 m or better [15].
A high resolution (0.4 m) DSM point cloud was generated from the stereo images in a dense image matching process using the software SURE of nFrames. Image matching is performed based on stereo image pairs and the results of multiple stereo pairs are subsequently fused [16]. The implemented matching algorithm is a modification of the SGM algorithm proposed by [17]. The software's default parameter set for aerial images of the given overlap was used. The following adaptations were made according to the experience of the research team with image matching for forest areas: triangulation accepts a maximum angle of 99° and the minimum number of detections per cell is 1. The interpolation for the extraction of a regular gridded point cloud was done with inverse distance weighting. A detailed description of the matching algorithm implemented in SURE can be found in [16,18].
After point cloud extraction, all points identified as outliers were eliminated from each cloud. These outliers were identified by considering the maximum tree height at the study site and the ALS-DTM available as reference. All points with elevations of 55 m and more above, and 1 m and more below the DTM, were regarded as outliers and excluded from the point cloud. The upper threshold is based on the assumption that trees in this area are usually not taller than 55 m. The lower threshold allows for some height inaccuracy in the point cloud.
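The height-based outlier filter described above can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: the function name, the tuple layout of the points, and the `dtm_height` callback are assumptions made for the example.

```python
def filter_points(points, dtm_height, upper_m=55.0, lower_m=-1.0):
    """Keep points whose height above the DTM lies strictly between lower_m and upper_m.

    points     -- iterable of (x, y, z) tuples (hypothetical layout)
    dtm_height -- callable (x, y) -> ground elevation from the ALS-DTM (assumed helper)
    """
    kept = []
    for x, y, z in points:
        dz = z - dtm_height(x, y)  # height above (positive) or below (negative) ground
        # 55 m and more above, or 1 m and more below the DTM, counts as an outlier.
        if lower_m < dz < upper_m:
            kept.append((x, y, z))
    return kept


# Usage with a flat terrain at 300 m above sea level.
cloud = [(0, 0, 300.0), (1, 0, 354.9), (2, 0, 355.0), (3, 0, 299.0), (4, 0, 299.5)]
print(filter_points(cloud, lambda x, y: 300.0))
```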
A standard process was applied to generate a canopy height model (CHM) from each filtered point cloud. First, a raster DSM with a spatial resolution of 1 m (equivalent to the resolution of the ALS-DTM) was derived from each point cloud using the software LAStools [19]. Each DSM pixel value was determined by the elevation of the highest point within the planimetric pixel extent (1 m²). In areas without matching points, the pixel values were calculated via Triangular Irregular Network (TIN) streaming [20]. The CHMs with 1 m resolution were obtained by subtracting the ALS-DTM from the photogrammetric DSMs.
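As a rough illustration of the rasterisation step, the sketch below assigns each point to a 1 m cell, keeps the highest elevation per cell, and subtracts the ground elevation. It is not the LAStools workflow: the dictionary-based raster, the function name, and the cell indexing are assumptions for the example, and cells without points are simply left empty instead of being filled by TIN interpolation.

```python
import math


def rasterise_chm(points, dtm, res=1.0):
    """Return {(col, row): canopy height} from (x, y, z) points.

    dtm -- dict {(col, row): ground elevation} on the same grid (hypothetical layout)
    """
    dsm = {}
    for x, y, z in points:
        cell = (math.floor(x / res), math.floor(y / res))
        # The DSM value of a cell is the highest point falling into it.
        if cell not in dsm or z > dsm[cell]:
            dsm[cell] = z
    # CHM = DSM - DTM; cells lacking a DTM value are skipped in this sketch.
    return {cell: z - dtm[cell] for cell, z in dsm.items() if cell in dtm}


points = [(0.2, 0.3, 310.0), (0.8, 0.9, 315.0), (1.5, 0.5, 305.0)]
dtm = {(0, 0): 300.0, (1, 0): 301.0}
print(rasterise_chm(points, dtm))  # {(0, 0): 15.0, (1, 0): 4.0}
```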

Transferring ACS in Height Metric Extraction
Traditionally, plot-based metrics are extracted from raster data by selecting relevant pixels using a simple geometric shape (for example, a circle or rectangle) and, in some cases, a pixel value (height) threshold. This is a suitable approach when the extraction geometry corresponds to the actual field sample plot geometry. However, ACS plots cannot be represented by such a simple geometry, and pixels that do not correspond to a sample tree might be used for metric calculation. In addition, some sample trees might not be represented in the metrics if they are located outside the chosen geometric shape. In this initial study, a novel approach was devised and tested that seeks to extract only data that most likely correspond to actual sample trees. The basic idea is to transfer the concept of sample tree selection in ACS. However, ACS requires some knowledge about the DBH of trees. This information is not available from airborne RS data and its derivatives such as CHMs. This is solved by utilising the relationship between DBH and tree height. For each pixel height, a theoretical DBH can be calculated and compared to a minimum DBH, which depends on the horizontal distance between pixel location and plot centre. A single pixel does not represent a tree. However, a pixel of a certain height can be considered part of the crown of a tree that is an actual sample tree. This approach is expected to select at least the upper parts of sample tree crowns. For large trees (large basal area), more pixels will be selected than for smaller trees at the same distance from the plot centre. Nevertheless, all trees where at least some upper parts of the crown reach a certain height will be selected, with the number of pixels selected being dependent on the tree size and its distance from the plot centre. Thus, the extraction of RS data mimics the sample tree selection process in ACS. This approach is detailed in the following sections. In this study, all data extraction, analysis, and modelling were implemented using the statistical software R version 3.3.1 [21].

Establishing Relationship between DBH and Tree Height
The relationship between DBH and tree height is a common and widely used factor in tree size determination [22]. However, this relationship is influenced by other factors such as tree species, stand characteristics, and site conditions [23]. In order to develop a practical data extraction approach that is applicable to a wide range of forest types and growing conditions, it is necessary to establish a generalised mathematical relationship between DBH and tree height. For this study, all tree heights and DBHs measured during the latest German NFI (BWI3) on plots located in Baden-Württemberg were included in a quantile regression analysis (Figure 2) using the interior point method for computing nonlinear quantile regression estimates as implemented in the function "nlrq" of the R package "quantreg" [24]. The resulting relationship between DBH and tree height is given in Equation (1), where DBH is the diameter at breast height, H_NFI is the measured tree height, and a and b are regression parameters that depend on the chosen quantile. The regression parameters were calculated for the 99%, 95%, 90%, 50%, 10%, 5%, and 1% quantiles. However, no quantiles above 50% were chosen for further investigation, as this would assume relatively large diameters at a certain height. With quantiles above 50%, the DBH-height relation would imply that more than 50% of all actually measured ACS trees have a DBH too small for inclusion in an angle-count sample. Furthermore, the 5% quantile did not show any substantial difference to the 1% quantile during analysis of results and was excluded from this publication in order to simplify the presentation of results. The parameters for the remaining quantiles of 50%, 10%, and 1%, which were used for further investigation, can be found in Table 1.
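To make the role of the quantile concrete, the sketch below shows the check (pinball) loss that quantile regression minimises, together with a brute-force grid search over two parameters. It is an illustration only and not the interior point method of "nlrq". In particular, the power-law form DBH = a · H^b is a hypothetical stand-in, because the functional form of Equation (1) is not reproduced in this text; the function names and grids are likewise assumptions.

```python
def pinball_loss(residuals, tau):
    """Check loss: observations above the curve weigh tau, those below weigh 1 - tau."""
    return sum(tau * r if r >= 0 else (tau - 1.0) * r for r in residuals)


def fit_power_quantile(heights, dbhs, tau, a_grid, b_grid):
    """Grid search for (a, b) minimising the check loss of DBH - a * H**b.

    HYPOTHETICAL model form; swap in the actual Equation (1) where available.
    """
    best = None
    for a in a_grid:
        for b in b_grid:
            residuals = [d - a * h ** b for h, d in zip(heights, dbhs)]
            loss = pinball_loss(residuals, tau)
            if best is None or loss < best[0]:
                best = (loss, a, b)
    return best[1], best[2]


# Synthetic data lying exactly on an assumed curve; the search recovers its parameters.
heights = [10.0, 15.0, 20.0, 25.0, 30.0]
dbhs = [1.2 * h ** 1.1 for h in heights]
print(fit_power_quantile(heights, dbhs, 0.5, [1.0, 1.2, 1.4], [0.9, 1.0, 1.1]))
```

A low quantile such as 1% places the curve near the lower edge of the DBH values observed at each height, so the DBH assigned to a pixel of a given height is small and fewer pixels pass the inclusion test than with the 50% quantile.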

Data Extraction Algorithm
Pixel data at each NFI plot is extracted for each parameter set of the quantile regression separately using the following "transfer ACS" (tACS) approach. An initial search radius of 30 m around each plot centre was chosen, which is expected to cover all probable distances between sample trees and plot centre in ACS with a BAF of 4. This decision was based on the fact that in the NFI data used in this study, no sample tree was more than 27 m distant from the plot centre. Choosing a 30 m radius allows extraction of all pixels belonging to a sample tree that is located at a distance of 27 m from the plot centre, with parts of its crown being even farther from the centre. For different data sets, one might consider using a different search radius. For each pixel whose centre coordinate falls within the search radius, the height value and the coordinate were extracted and the distance to the plot centre was calculated. This distance was used for calculating a minimum DBH at each pixel by implementing Equation (2), where minDBH_p is the minimum DBH at pixel p, dist_p is the distance between pixel p and the plot centre, and BAF is the basal area factor, which in this study was set to 4. In cases where the calculated minDBH was below 7 cm, which is the minimum required DBH for NFI field sampling in Germany, the calculated value was overridden and minDBH was set to 7 cm instead. Subsequently, a theoretical DBH (tDBH) was calculated for each pixel using Equation (1) and the respective parameters of Table 1. A pixel where tDBH is equal to or larger than minDBH is considered part of a tree that was probably sampled during NFI field data acquisition. Pixels where minDBH is not met are flagged as 'not sampled' and are eventually removed from the set of extracted pixels. All pixels that remain in the data set for each plot are used for metrics calculation (Section 2.4.1). Graphic representations of the results of tACS pixel data extraction for three exemplary plots can be found in Figure 3.
For comparison, pixels were also extracted using two variations of the conventional method with fixed-radius plots. The radii used were derived in two different ways. In the first variation, the maximum distance between plot centre and sample trees was derived for each plot individually (maxDist_ind). This maximum distance was measured during the terrestrial inventory measurements at each plot and was used as the fixed radius for the respective plot centre. Of course, this is not a viable approach for actual forest inventory applications, but it is expected to represent an ideal plot size for reference. In the second variation, the median of all maximum distances (of maxDist_ind) was calculated and used as fixed radius for all plot centres (maxDist_med). In this study, the calculated radius for maxDist_med was 11.7 m. All pixels that are overlapped by a circle of the respective plot radius were extracted. In a second step, all pixels with a height value below 6 m were removed from this set of pixels. Six metres is a threshold that resulted from expert interviews on forest management planning at ForstBW, the Baden-Württemberg state office for forestry. It approximates the height of trees when they reach the minimum required DBH of 7 cm. This threshold was successfully applied in a project for improving forest management planning with RS methods at Forstliche Versuchs- und Forschungsanstalt Baden-Württemberg (FVA), Germany. It helped to improve models for timber volume estimation based on enterprise inventory data with sampling on concentric plots. Graphic representations of the results of the fixed-radius data extraction approaches for three exemplary plots can be seen in Figure 4.
In total, five variations of CHM data extraction approaches were applied, resulting in five data sets. Three variations use transfer ACS (tACS_q50, tACS_q10, tACS_q1) and two variations use circular geometries for data extraction (maxDist_ind, maxDist_med). The variations are listed in Table 2.
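The tACS pixel selection described in this section can be sketched as below. Two parts are assumptions and are marked as such in the code: the minimum DBH uses the standard ACS limiting-distance relation (DBH in cm, distance in m), which is presumably what Equation (2) expresses, and the theoretical DBH uses a hypothetical power law because the form of Equation (1) is not reproduced here. Function names and the pixel tuple layout are illustrative.

```python
import math


def min_dbh_cm(dist_m, baf=4.0, dbh_floor_cm=7.0):
    """Minimum DBH for inclusion at dist_m; never below the NFI threshold of 7 cm."""
    # ASSUMPTION: standard ACS relation minDBH = 2 * sqrt(BAF) * dist.
    return max(2.0 * math.sqrt(baf) * dist_m, dbh_floor_cm)


def theoretical_dbh_cm(height_m, a, b):
    """HYPOTHETICAL stand-in for Equation (1): tDBH = a * height**b."""
    return a * height_m ** b


def tacs_select(pixels, centre, a, b, baf=4.0, search_radius_m=30.0):
    """Return the (x, y, height) pixels whose tDBH reaches the distance-dependent minDBH."""
    cx, cy = centre
    selected = []
    for x, y, h in pixels:
        dist = math.hypot(x - cx, y - cy)
        # Skip pixels outside the search radius; non-positive heights cannot pass
        # the test and would break the fractional power for negative values.
        if dist > search_radius_m or h <= 0:
            continue
        if theoretical_dbh_cm(h, a, b) >= min_dbh_cm(dist, baf):
            selected.append((x, y, h))
    return selected


# With a = b = 1 the theoretical DBH in cm equals the height in m (illustration only).
pixels = [(5.0, 0.0, 25.0), (5.0, 0.0, 15.0), (1.0, 0.0, 7.0), (31.0, 0.0, 60.0)]
print(tacs_select(pixels, (0.0, 0.0), a=1.0, b=1.0))
```

With a BAF of 4, the assumed relation gives a minimum DBH of 20 cm at 5 m and 108 cm at 27 m, which illustrates why only very tall canopy pixels are selected near the edge of the search radius.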

Outlier Removal
NFI field data in this study was acquired between 2011 and 2012, while the aerial imagery was acquired in 2013. This time lag of 1-2 years can introduce a source of error due to tree growth and trees disappearing, which might be caused by forest operations or natural factors. Furthermore, insufficient spatial co-registration might lead to improper data being extracted. Plots that are affected by this can be considered outliers, and should be identified and removed from the data set before establishing statistical models for forest variable estimation [25]. In this study, outliers were removed by comparing the maximum measured tree height (max(H_NFI)) at each plot with the maximum height in the CHM (max(H_CHM)) of this plot. Plots where max(H_CHM) differed by more than 25% from max(H_NFI) were considered outliers and removed from the data sets. The percentage of difference was calculated according to Equation (3), where devH_max is the percentage of difference, max(H_NFI) is the maximum measured height on the NFI plot, and max(H_CHM) is the maximum CHM height on this plot. The plot area for CHM data extraction was defined by the maximum measured distance between a tree and the plot centre of the respective plot plus a 2 m buffer. The maximum distance is measured to the centre of the respective tree, which is probably one of the largest trees in the plot. The size of the buffer was chosen in order to allow for inaccurate positioning of plot centres and non-verticality of tree stems. With this approach, 34 plots were identified as outliers and removed from the data sets, reducing the number of plots for metric calculation to 518.
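The outlier criterion can be illustrated with a few lines of code. The exact form of Equation (3) is not reproduced in this text, so the sketch assumes the absolute height difference expressed as a percentage of the field-measured maximum height; the function names are hypothetical.

```python
def dev_h_max(max_h_nfi, max_h_chm):
    """Percentage difference between CHM and field maximum height.

    ASSUMED form: |max(H_CHM) - max(H_NFI)| / max(H_NFI) * 100.
    """
    return abs(max_h_chm - max_h_nfi) / max_h_nfi * 100.0


def is_outlier(max_h_nfi, max_h_chm, threshold_pct=25.0):
    """A plot is flagged when the deviation exceeds 25%."""
    return dev_h_max(max_h_nfi, max_h_chm) > threshold_pct


# A 30 m stand that appears as 20 m in the CHM (e.g., after harvesting) is flagged.
print(is_outlier(30.0, 20.0))  # True
print(is_outlier(30.0, 28.0))  # False
```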

Descriptive Metrics and Modelling
Descriptive metrics were calculated and timber volume models were created separately for each data set in Table 2.
where CHM_ep are the extracted CHM pixels and res is the geometric resolution of the CHM. The volout metric is the volume between the outer canopy boundary and the maximum bounding box of the canopy [27] (Equations (5) and (6)), where maxVol_CHM is the volume defined by the maximum height in the CHM, H_ep are the heights of the extracted pixels, res is the geometric resolution of the CHM, and count(ep) is the number of extracted pixels.
Both volume metrics provide information about the complexity of the outer canopy structure (Figure 5). Furthermore, the number of extracted pixels (npix) was derived and included as a metric.
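A possible reading of the two volume metrics and the pixel count is sketched below. The equations themselves are not reproduced in this text, so the formulas are reconstructed from the symbol definitions and should be treated as assumptions: the first volume is taken as the sum of extracted pixel heights times the pixel area, maxVol_CHM as the maximum height times the number of pixels times the pixel area, and volout as their difference. The key name `vol_in` is hypothetical.

```python
def volume_metrics(heights, res=1.0):
    """Compute assumed canopy volume metrics from the heights of extracted pixels.

    heights -- heights (m) of the extracted CHM pixels, H_ep
    res     -- geometric resolution of the CHM (m)
    """
    n = len(heights)
    if n == 0:
        # Plots without selected pixels: all metrics are 0.
        return {"vol_in": 0.0, "vol_out": 0.0, "npix": 0}
    cell_area = res * res
    vol_in = sum(heights) * cell_area          # volume below the canopy surface
    max_vol = max(heights) * n * cell_area     # bounding box up to the maximum height
    return {"vol_in": vol_in, "vol_out": max_vol - vol_in, "npix": n}


print(volume_metrics([10.0, 20.0, 30.0]))
# {'vol_in': 60.0, 'vol_out': 30.0, 'npix': 3}
```

A perfectly flat canopy yields a volout of 0 under this reading, while a rough canopy with deep gaps yields a large value.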
One additional plot-based metric that was not derived from the CHM, but from the DTM, is the mean ground elevation (meanDTM). This mean was calculated based on all DTM data within a radius of 30 m around each plot centre. This metric is the same for all CHM data extraction approaches.

Modelling
In total, 22 metrics were derived for timber volume modelling (Section 2.4.1). Some of these metrics were highly correlated and a correlation analysis was conducted in order to avoid collinearity. Metrics whose pairwise Pearson correlation exceeded 0.8 (80%) were carefully examined and the metric that was most correlated with timber volume was kept. This analysis was conducted for all data sets and the results were combined in one initial set of metrics that were used for timber volume modelling.
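One way to mechanise this screening is a greedy filter: rank the metrics by their absolute correlation with timber volume and keep a metric only if it is not too strongly correlated with any metric already kept. The sketch below is an assumed automation of a step that the text describes as a careful (possibly manual) examination; function names and the use of absolute correlations are choices made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def drop_collinear(metrics, response, threshold=0.8):
    """Greedy filter over {name: values}; returns the names of the metrics kept."""
    ranked = sorted(metrics, key=lambda m: abs(pearson(metrics[m], response)), reverse=True)
    kept = []
    for name in ranked:
        # Keep a metric only if it is not collinear with any better-ranked metric.
        if all(abs(pearson(metrics[name], metrics[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


metrics = {
    "m1": [1, 2, 3, 4, 5],
    "m2": [2, 4, 6, 8, 11],   # nearly a multiple of m1
    "m3": [5, 1, 4, 2, 3],
}
volume = [10, 20, 30, 40, 50]
print(drop_collinear(metrics, volume))  # ['m1', 'm3']
```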
Linear models were fitted to each data set. Linear models were chosen for all data sets to be able to compare the effect of transferring the concept of ACS to RS-assisted timber volume modelling. Nonlinearity in the data was handled by additionally including some of the predictor metrics as squared terms in the models. The initial maximum models were reduced stepwise based on the Akaike information criterion (AIC) using forward and backward selection. Furthermore, square root transformation of the response and weighted least squares were tested during modelling. For applying weights, it was necessary to account for plots where no pixels were selected and all metrics were 0. This was achieved by adding a small absolute term (0.01) to the denominator. This value is much lower than the measurement accuracy of heights within aerial image derived CHMs.

Validation
For each model, root mean square error percent (RMSE%) and bias% of model predictions were estimated for each fold in a 10-fold cross-validation as presented in Table 3. For each model, RMSE% and bias% values from each fold were averaged. Furthermore, the standard deviation of bias% from each fold was calculated.
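The validation scheme can be sketched as follows. Table 3 is not reproduced in this text, so the definitions used here are assumptions: RMSE% and bias% are taken relative to the mean observed value of the fold. The single-predictor least-squares fit, the interleaved fold assignment without shuffling, and all function names are simplifications for illustration and do not represent the models of this study.

```python
import math
import statistics


def rmse_bias_pct(observed, predicted):
    """ASSUMED definitions: RMSE and mean error as a percentage of the observed mean."""
    n = len(observed)
    mean_obs = sum(observed) / n
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    bias = sum(p - o for o, p in zip(observed, predicted)) / n
    return 100.0 * rmse / mean_obs, 100.0 * bias / mean_obs


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def cross_validate(x, y, k=10):
    """Return mean RMSE%, mean bias% and the standard deviation of bias% over k folds."""
    n = len(x)
    rmses, biases = [], []
    for fold in range(k):
        test_idx = list(range(fold, n, k))  # every k-th plot forms one fold
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        intercept, slope = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        obs = [y[i] for i in test_idx]
        pred = [intercept + slope * x[i] for i in test_idx]
        rmse_pct, bias_pct = rmse_bias_pct(obs, pred)
        rmses.append(rmse_pct)
        biases.append(bias_pct)
    return statistics.mean(rmses), statistics.mean(biases), statistics.stdev(biases)


x = [float(v) for v in range(1, 21)]
y = [2.0 * v + 1.0 for v in x]
print(cross_validate(x, y))  # all three values are close to 0 for noise-free data
```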
Several models with varying combinations of predictive metrics were tested for each data set and the model with lowest RMSE% was chosen as best model.These "best models" were used for comparing and assessing the performance of the different data extraction approaches.
These best performing models were used to predict timber volumes on all 518 sample plots.Model generation and volume prediction is consequently based on the same data set.This will lead to a prediction accuracy that is not realistic for practical wall-to-wall applications.However, these values are considered sufficient for comparison of the different data extraction approaches.

Table 3. Equations for calculating root mean square error (RMSE), RMSE%, and bias%. In each equation, ŷ_i and y_i represent predicted and observed timber volume for sample plot i, respectively, and n is the number of samples.

Metric
Equation
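The validation procedure can be sketched under the conventional definitions of these measures, which are an assumption here: RMSE = sqrt(Σ(ŷ_i − y_i)² / n), RMSE% = 100 · RMSE / mean(y), and bias% = 100 · (Σ(ŷ_i − y_i) / n) / mean(y). The function names and the `fit`/`predict` callables are hypothetical placeholders for any model.

```python
import math
import random
from statistics import mean, stdev


def rmse_percent(pred, obs):
    """RMSE relative to the mean observed value, in percent (assumed form)."""
    rmse = math.sqrt(mean([(p - o) ** 2 for p, o in zip(pred, obs)]))
    return 100.0 * rmse / mean(obs)


def bias_percent(pred, obs):
    """Mean of (predicted - observed) relative to the mean observed value."""
    bias = mean([p - o for p, o in zip(pred, obs)])
    return 100.0 * bias / mean(obs)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the plot indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(x, y, fit, predict, k=10, seed=0):
    """Return (mean RMSE%, mean bias%, sd of bias%) over k folds.

    fit(x_train, y_train) -> model; predict(model, x_test) -> predictions.
    """
    rmses, biases = [], []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        pred = predict(model, [x[i] for i in test_idx])
        obs = [y[i] for i in test_idx]
        rmses.append(rmse_percent(pred, obs))
        biases.append(bias_percent(pred, obs))
    return mean(rmses), mean(biases), stdev(biases)
```

Averaging per-fold values, rather than pooling all held-out predictions, mirrors the description that RMSE% and bias% from each fold were averaged and that the standard deviation of bias% across folds was reported.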

Results
The initial, uncorrelated variables used for modelling for each data set were mean, npix, cv, volout, and meanDTM. The predictive metrics used in the best performing model for each data set are listed in Table 4. The number of extracted pixels (npix) and the mean ground elevation (meanDTM) were selected for all data sets, indicating high robustness of these metrics for modelling timber volume. The metrics mean and cv, and the interaction of mean and npix (mean × npix), were selected four of five times. Two metrics (volout and the squared mean (mean²)) were selected only once, each with one of the fixed-radius approaches. Table 4 also shows that including weights in the model only had a positive effect for the fixed-radius approaches.
1 An absolute term of 0.01 was added to mean in order to account for plots where no pixels were selected and mean = 0.
Table 5 displays the RMSE% achieved in a 10-fold cross-validation for all data sets. The values range from 41.57% to 42.43%, with both the lowest (tACSq1) and the highest (tACSq50) values achieved with the transferred ACS approach. The difference is less than one percentage point. Bias% varied around zero between negative and positive values for all data sets (Table 5), reflecting underestimation in high volume ranges and overestimation in low volume ranges, respectively (Figure 6).
Figure 6 shows the scatterplots of observed and predicted timber volume for the best performing models of all data sets using all plots. In all instances, there is a distinct roof-like limit for predictions at about 700 m³ solid volume including bark per ha. This is least pronounced with the maxDistind data set (Figure 6d). For high volumes, this results in severe underestimation. For some low volume plots, substantial overestimation can also be observed. This was already indicated by the values for bias% and is also reflected in Figure 7, which shows the mean volume prediction for observed volume classes (class width: 100 m³/ha) for each data set. For these predictions, the same plots were used as for model generation. The mean timber volume of the NFI is also depicted, showing that the lower volume classes (100, 200, 300 m³/ha) are overestimated in all data sets, while the upper volume classes (600-1600 m³/ha) are underestimated. There were no significant differences between the predictions of the models (p = 1).
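The class-wise evaluation can be sketched as below. The sketch assumes that each class is labelled by its upper bound (so that an observed volume of 250 m³/ha falls into class 300); the labelling actually used for the figures is not stated, and the function name `by_volume_class` is hypothetical.

```python
import math
from collections import defaultdict


def by_volume_class(obs, pred, width=100.0):
    """Group plots by observed volume class and summarise predictions.

    Returns {class label: {"n", "mean_obs", "mean_pred", "rmse"}}.
    Class labels are upper class bounds (an assumption).
    """
    groups = defaultdict(list)
    for o, p in zip(obs, pred):
        label = max(1, math.ceil(o / width)) * width
        groups[label].append((o, p))
    summary = {}
    for label in sorted(groups):
        pairs = groups[label]
        n = len(pairs)
        summary[label] = {
            "n": n,
            "mean_obs": sum(o for o, _ in pairs) / n,
            "mean_pred": sum(p for _, p in pairs) / n,
            "rmse": math.sqrt(sum((p - o) ** 2 for o, p in pairs) / n),
        }
    return summary
```

Comparing `mean_pred` with `mean_obs` per class exposes the pattern described above: a class is overestimated where the mean prediction exceeds the mean observation and underestimated where it falls below.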

Discussion
This study was an initial investigation into a novel approach for integrating ACS and RS data for enhancing forest inventories. This new approach transfers the concept of ACS to the RS data extraction process for calculation of predictive metrics, aiming to better represent the plot design of ACS. Figure 3 shows that transferred ACS actually extracts data in irregular plot shapes and, especially in the cases of tACSq50 and tACSq10, meets the maximum measured distance of the terrestrial ACS data acquisition at the respective plot. The implementable fixed-radius approach (maxDistmed) extracts data using a circle whose radius is usually smaller or larger than the maximum measured distance (Figure 4b), resulting in data extraction areas that do not correspond to the actual area from which sample trees were selected. Deo et al. [4] suspected geometric inconsistency between RS and ACS field data as a source of decreased modelling precision, especially for old stands. The transferred ACS approach was therefore expected to improve timber volume modelling. However, the current implementation of this approach did not result in an improvement compared to the conventional fixed-radius plot approach. In fact, the RMSE% achieved after 10-fold cross-validation (Table 5) differed by less than one percentage point, which demonstrates that the transferred ACS and fixed-radius approaches led to predictions of similar quality. One explanation could be the missing information about understory sample trees in image-based RS data. This might influence both data extraction approaches in the same way and could indicate that tACS might actually be more successful when applied to ALS data, where understory information can be retrieved. However, further tests restricted to single-layer stands will be necessary in order to verify this.

Furthermore, comparing the height metrics of the different approaches directly reveals that the values of the respective plots vary notably. The most obvious difference occurs in the metric volin. Figure 9 shows, as an example, the relation between volin from fixed-radius plots (maxDistmed) and volin based on one variation of the tACS approach (tACSq1). The tACSq1 volin metric increases exponentially compared to the maxDistmed volin metric. This relationship was discovered during the assessment of results in this study and could not yet be sufficiently exploited in the volume estimation. The form of the exponential relationship suggests that it might be helpful in adjusting for the over- and underestimation of timber volume (Figure 6). This requires further investigation into the usability of the information contained in the relationship in Figure 9.

With the current linear model, all approaches show a typical pattern of over- and underestimation of timber volume. This leads to two different interpretations.

1. The variation of volume within the ACS data cannot be explained by the height metrics used for modelling.
2. The reference data and the explaining data do not match regarding their position and content.
This pattern of over- and underestimation was also observed in other studies where either photogrammetrically derived CHMs (e.g., [28]) or ALS-based CHMs (e.g., [9]) were used for forest timber volume modelling. Maack et al. [9] suspected an uneven distribution of samples with regard to timber volume, with most samples being in the mid-range between 200 and 400 m³/ha. They also remark that the relationship between growth in height and growth in diameter is not linear and changes depending on the age of the trees.
Other influences, such as the positional accuracy of plot locations, bias in the NFI data due to non-detection [29,30], or spatial and species-specific variation in the height-volume relation, could significantly influence the model accuracy. This could be the reason why all methods lead to similar trends of overestimation in lower volume classes and underestimation in higher volume classes. Unfortunately, when the NFI plots were established, no assessment of positional accuracy was carried out. Due to the general uncertainties of GNSS positioning under canopy, even the GNSS device's nominal accuracy is not a reliable indicator of the positional accuracy of NFI plots. Furthermore, the unknown shape and size of ACS plots hinder any reliable manual or automatic co-registration of ACS and RS data. This prevents a thorough assessment of the influence of the positional accuracy of terrestrial measurements on the model accuracy.
The highest potential for improvement is expected in adapting the DBH-height relationship utilised for CHM data extraction (Section 2.3.1). The DBH-height relationship was established using all NFI data for the entire state of Baden-Württemberg, comprising data from very different regions in terms of site quality and prevailing forest types. Adapting this relationship to the geographic region of the study site by using data from this region only might improve the selection of relevant pixels of the CHM. Further adaptation of the DBH-height relationship should consider the major differences between conifers and broadleaf trees. Under more homogeneous conditions, for example in predominantly coniferous forests, overall model accuracy is expected to improve, as edge effects and growth differences between tree species are reduced. For heterogeneous forests, separate DBH-height functions for conifers and broadleaves are expected to have a similar effect and notably improve data extraction. Data could be extracted based on the dominant, plot-based tree type: conifers, broadleaves, or mixed. However, this requires a high-quality tree type classification. This classification can theoretically be achieved using the spectral information of the aerial images. Generally, including suitable spectral metrics in the model should improve prediction accuracy. However, aerial images suffer from changing illumination conditions between, as well as within, images [31,32]. Both affect the reliability and robustness of spectral metrics derived from aerial images, which makes them not well suited for forest variable prediction unless the effect of changing illumination can be compensated.
Another possibility for improvement could be to use transferred ACS not for extracting single pixels from the CHM, but for defining an ideal plot radius for any plot position. The basic idea would be to use the DBH-height relationship to identify the pixel that is farthest away from the plot centre while still indicating a tree that is theoretically an ACS sample tree. The distance between this pixel and the plot centre then defines the respective plot radius. This approach would be comparable to the approach using the individually measured maximum distance (maxDistind), but would actually be usable in practical applications.
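The proposed radius definition could be sketched as follows. Several elements are assumptions made only for illustration: the basal area factor of 4 m²/ha, the standard ACS limiting distance R = DBH[cm] / (2 · sqrt(BAF)) in metres, and above all the placeholder `dbh_from_height`, which is a purely hypothetical linear rule and not the quantile regression of Section 2.3.1. The 30 m search radius and the 6 m minimum height are taken from the figure captions.

```python
import math


def limiting_distance(dbh_cm, baf=4.0):
    """Largest distance (m) at which a tree of this DBH is counted in ACS."""
    return dbh_cm / (2.0 * math.sqrt(baf))


def dbh_from_height(height_m):
    """Hypothetical placeholder for the DBH-height relationship (cm from m)."""
    return 2.0 * height_m


def tacs_plot_radius(chm, pixel_size, centre_rc, search_radius=30.0,
                     min_height=6.0, baf=4.0, height_to_dbh=dbh_from_height):
    """Distance (m) of the farthest CHM pixel that would still be 'counted'.

    chm: 2D list of canopy heights in metres; centre_rc: (row, col) of the
    plot centre in pixel coordinates. Returns 0.0 if no pixel qualifies.
    """
    r0, c0 = centre_rc
    radius = 0.0
    for r, row in enumerate(chm):
        for c, h in enumerate(row):
            if h is None or h < min_height:
                continue  # ignore ground and low vegetation
            dist = math.hypot(r - r0, c - c0) * pixel_size
            if dist > search_radius:
                continue  # outside the search window
            if dist <= limiting_distance(height_to_dbh(h), baf):
                radius = max(radius, dist)
    return radius
```

With a 1 m pixel size and the placeholder rule, a 30 m tall pixel located 5 m from the centre is counted (limiting distance 15 m), whereas a 10 m tall pixel located 9 m away is not (limiting distance 5 m), so the resulting plot radius would be 5 m.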

Conclusions
The described tACS approach is a first step in adapting airborne-derived height information to terrestrial ACS. Despite the similar results of the tested methods, one advantage of the new approach is that it does not require preliminary investigations for finding the best suited plot radius for ACS plots, as was the case in [9,10,11], or extracting large amounts of data, as in [13]. Furthermore, the results presented here are based on some initial considerations towards enhancing the integration of ACS field data and RS data. After all, the results did not worsen and there is still potential for further improvement of the approach.
The presented tACS approach will be further developed, aiming to enhance modelling results. An investigation into whether all actual sample trees of the terrestrial ACS data acquisition were covered by extracted pixels is expected to support identification of the most promising alterations to the data extraction algorithm and modelling approach. Furthermore, a test of the temporal and spatial robustness of the tACS metrics and models is planned. Another interesting point for future research could be the comparison between the tACS approach and other approaches that seek to combine ACS and RS data, such as truncated ACS (e.g., [6]) and finding suitable fixed plot radii (e.g., [10,11,13]).

Figure 1. Location and extent of study site: (a) Location of study site in Germany; (b) Extent of study site and distribution of national forest inventory (NFI) tracts superimposed on colour-infrared (CIR) orthophotos; (c) Details of NFI tract layout with four tract corners at each tract that establish NFI plot centres superimposed on CIR orthophotos.

Figure 2. Result of quantile regression analysis of DBH-height relationship using all NFI (BWI3) sample tree data for Baden-Württemberg.

Figure 3. Example of selected pixels (blue crosses) in three different plots for each transfer angle-count sampling (tACS) data set. The background image is the CHM, the outer black circle is the 30 m search radius, the red circle is the terrestrially measured maximum distance between plot centre and sample trees (maxDistind), and the inner black circle is the median of maximum distances (maxDistmed = 11.7 m) for reference: (a) tACSq50; (b) tACSq10; (c) tACSq1.

Figure 4. Example of selected pixels (blue crosses) in three different plots for each fixed-radius data set. The background image is the canopy height model (CHM), the outer black circle is the 30 m search radius from the tACS data sets, the red circle is the terrestrially measured maximum distance between plot centre and sample trees (maxDistind), and the inner black circle is the median of maximum distances (maxDistmed = 11.7 m) for reference: (a) maxDistind; (b) maxDistmed.

Forests 2017, 8, 238

Figure 5. Schematic description of inner volume (volin) and outer volume (volout). The grey area, which represents space over vegetation of less than 6 m height, is excluded from any volume calculation.

Figure 7. Mean volume prediction for observed volume classes (categories) for each data set utilising all plots, as well as the mean volume terrestrially measured at all NFI plots.

Figure 8 depicts the absolute RMSE achieved with each data set in different volume classes (observed values). These values were calculated using all plots. The RMSE in the low volume classes (100-400 m³/ha) reaches values of up to 200 m³/ha. It is lowest for the medium volume classes (500 and 600 m³/ha) and increases rapidly from approximately 150 m³/ha in the 700 m³/ha class to 650 m³/ha in the 1600 m³/ha class. There is no noteworthy difference between the data sets.

Figure 8. Root mean square error (RMSE) achieved with each data set in different volume classes (categories) (observed values) utilising all plots.

Table 2. Data sets resulting from the differing data extraction approaches and their variations.
tACSq10 | complex | transfer ACS; 10%-quantile for DBH-height relationship
tACSq1 | complex | transfer ACS; 1%-quantile for DBH-height relationship
maxDistind | in general circular | plot radius = max. distance; min. height 6 m
maxDistmed | in general circular | plot radius = median max. distance; min. height 6 m

Table 4. Predictive metrics selected in the best performing model for each data set (mean² is the squared mean and mean × npix is the interaction between the metrics mean and npix).

Table 5. Mean of root mean square error percent (RMSE%) and mean and standard deviation (sd) of bias% for each data set achieved in 10-fold cross-validation.