Open Access
This article is

- freely available
- re-usable

*Remote Sens.*
**2018**,
*10*(7),
1052;
https://doi.org/10.3390/rs10071052

Article

A Double-Sampling Extension of the German National Forest Inventory for Design-Based Small Area Estimation on Forest District Levels

^{1}

Department of Environmental Systems Science, ETH Zurich, 8092 Zurich, Switzerland

^{2}

State Forest Service Rhineland-Palatinate, 56281 Emmelshausen, Germany

^{*}

Author to whom correspondence should be addressed.

Received: 17 May 2018 / Accepted: 26 June 2018 / Published: 3 July 2018

## Abstract

**:**

The German National Forest Inventory consists of a systematic grid of permanent sample plots and provides a reliable evidence-based assessment of the state and the development of Germany’s forests on national and federal state level in a 10 year interval. However, the data have yet been scarcely used for estimation on smaller management levels such as forest districts due to insufficient sample sizes within the area of interests and the implied large estimation errors. In this study, we present a double-sampling extension to the existing German National Forest Inventory (NFI) that allows for the application of recently developed design-based small area regression estimators. We illustrate the implementation of the estimation procedure and evaluate its potential for future large-scale operational application by the example of timber volume estimation on two small-scale management levels (45 and 405 forest district units respectively) over the entire area of the federal German state of Rhineland-Palatinate. An airborne laserscanning (ALS) derived canopy height model and a tree species classification map based on satellite data were used as auxiliary data in an ordinary least square regression model to produce the timber volume predictions. The results support that the suggested double-sampling procedure can substantially increase estimation precision on both management levels: the two-phase estimators were able to reduce the variance of the one-phase simple random sampling estimator by 43% and 25% on average for the two management levels respectively.

Keywords:

National Forest Inventory; small area estimation; forest districts; double sampling for regression within strata; cluster sampling; canopy height model; tree species classification## 1. Introduction

The German National Forest Inventory (NFI) provides reliable evidence-based and accurate information of the current state and the development of Germany’s forest over time. The NFI thereby has the responsibility to satisfy various information needs including reporting to public and state forestry administrations, wood-based industries and the public on the national level, as well as to the Food and Agriculture Organization of the United Nations (FAO) and to the United Nations Framework Convention on Climate Change (UNFCC) on the international level [1]. The current design of the German NFI rests solely upon a terrestrial cluster inventory that is carried out at sample locations systematically distributed over the entire forested area of Germany. In order to cover a large area of 114,191 km${}^{2}$ [2], the sample size has been specifically chosen to satisfy high estimation accuracies for forest attributes on the national and federal state levels. However, sample sizes often drop dramatically when entering spatial units below the federal state level. This is particularly true for forest management levels such as forest districts for which the estimation uncertainties turn out to be unacceptably large due to the very limited number of sample plots within these units. For this reason, the German NFI data have not yet been extensively incorporated into operational planning on forest district management levels. In most German federal states, management strategies are thus still based on expert judgements from time-consuming standwise forest inventories (SFI), which are prone to systematic deviations [3] and do not provide any measure of uncertainty.

Some German federal states, such as Lower Saxony, have approached this problem by establishing a regional Forest District Inventory (FDI) carried out for forests owned by the state forest enterprise with a much higher sampling density than used by the NFI in order to scientifically base their regional management strategies on quantitative and accurate information [4]. However, such FDIs are cost-intensive and, facing increasing restrictions in budget and staff resources, there has been a need for more cost-efficient inventory methods [5,6]. One method which has proven to be efficient is double- or two-phase sampling [7,8,9,10,11,12]. Double-sampling incorporates less expensive auxiliary information and can be used to either increase estimation precision under a fixed terrestrial sample size, or maintain estimation precision under reduced terrestrial sample size.

Double-sampling procedures have already been used in various countries such as Canada [13], the USA [14], Switzerland [15], Italy [16] and Germany [17]. Recent studies from Grafström et al. [18] illustrated how to use the auxiliary information to determine optimised balanced terrestrial sample designs. Double-sampling has also been extended to triple-sampling estimation methods using auxiliary information derived at two different sampling intensities. An example can be found in von Lüpke et al. [5] who illustrated an extension of the existing two-phase FDI of Lower Saxony to a three-phase design that uses updates of past inventory data as additional auxiliary information and allows for a significant reduction of the terrestrial sample size in intermediate inventories. Another example is Massey et al. [19] who developed a triple-sampling extension based on the ideas of Mandallaz [20] for the Swiss NFI that can significantly reduce the increase in estimation uncertainty caused by the new annual inventory design.

Two-phase and three-phase samplings techniques have also been applied to small area estimation (SAE). SAE techniques address the situation where the number of samples within a subunit, or small area (SA), of the entire sampling frame is too small to provide reliable estimates for that unit. A broad range of SA estimators used in forest inventories [11] originally comes from official statistics. One such method that is commonly applied is known as indirect estimation [21], where statistical models are used to convert auxiliary information into predictions of the target variable that is rarely or not observed in the small area. These models are trained using data from outside the small area in order to “borrow strength” from areas where information is available. Of numerous applications of SAE in forestry [22,23,24,25], most use unit-level models, i.e., the inventory plot is the unit of the response variable in the training data used for the model fit. Such unit-level models have been intensively investigated for timber volume estimation using various remote sensing auxiliary data [26,27]. Other studies have investigated area-level models, where the auxiliary information is only provided on the SA level [28]. Some studies have illustrated that even NFI data derived under low sampling densities can still be used to provide acceptable precision of small area estimates on much smaller management levels. One example is Breidenbach and Astrup [22] who used data from the Norwegian NFI to make small area estimation for standing timber volume for 14 municipalities where the number of NFI samples within these areas were between 1 and 35. The estimation errors under the applied model-dependent and design-based small area estimators turned to be markedly smaller than under the standard one-phase estimator. Another example is Magnussen et al. [29] who recently used the Swiss NFI data to estimate timber volume within 108 Swiss forest districts with sample sizes between 9 and 206. Similar studies using German NFI data for small area estimation have been lacking.

The objective of this study was to investigate whether the application of latest design-based small area estimation methods allow to use the German NFI data to produce estimates of acceptable precision on two forest district levels. The methods were tested in the German federal state Rhineland-Palatinate. Three types of model-assisted design-based small area regression estimators were used to derive point and variance estimates of mean standing timber volume for 45 and 405 forest management units on the two respective district levels. The SA-estimators we considered were the pseudo-small, extended pseudo-synthetic and the pseudo-synthetic design-based small area estimator suggested by Mandallaz [30] and Mandallaz et al. [25]. Auxiliary data consisted of a canopy height model (CHM) obtained from a countrywide airborne laser scanning (ALS) and a tree species classification map to be used for regression within tree species strata. The estimation precisions were compared to those obtained by the standard one-phase estimator for cluster sampling under simple random sampling. The chosen double-sampling estimators were selected for several reasons: (i) the design-based framework relaxes dependencies on the regression model assumptions which seemed appropriate facing severe quality restrictions in the ALS data; (ii) the estimators can be used with non-exhaustive, i.e., non wall-to-wall, auxiliary information; (iii) all estimators are explicitly formulated for cluster sampling which has not yet been the case for frequently used model-dependent estimators and (iv) the asymptotically unbiased g-weight variance partially accounts for estimating the regression coefficients on the same sample used for estimation (internal model approach) and is also robust under heteroscedasticity of the model residuals. The results from this study were considered to provide valuable information about the potential of the suggested small area estimation procedure and the incorporated auxiliary information for future operational large scale application.

## 2. Terrestrial Sampling Design of the German NFI

The German NFI (Bundeswaldinventur, BWI) is a periodic inventory that is carried out every 10 years over the entire forest area of Germany. The third and most recent inventory (BWI3) was conducted in 2011 and 2012. While information was originally gathered on a systematic 4 × 4 km grid, some federal states such as Rhineland-Palatinate have switched to a densified 2 × 2 km grid. The German NFI uses a cluster sampling design, which means that a sample unit consists of at most four sample locations (also referred to as sample plots) that are arranged in a square, called cluster, with a side length of 150 metres. The number of plots per cluster can vary between 1 and 4 depending on forest/non-forest decisions by the field crews on the individual plot level [31]. In the field survey of the BWI3, sample trees for timber volume estimation are selected according to the angle count sampling technique [32]. Angle count sampling is a visual selection procedure conducted via an optical instrument (Relascope). At a sample point x, a tree is selected as a sample tree if its apparent diameter at the height of 1.3 m ($DBH$) and the associated angle ${\alpha}_{i}$ (in radiant) appears larger than a limit angle $\alpha $ when viewed through the Relascope [12]. The limit angle $\alpha $ can be expressed by the so-called basal area factor ($BAF$), i.e., $BAF={10}^{2}si{n}^{2}\left(\frac{\alpha}{2}\right)$. Consequently, each tree i is assigned with an individual radius ${R}_{i}=\frac{DB{H}_{i}}{2\sqrt{BAF}}$ which is the distance between the tree location and the sample plot center. A tree is selected if ${R}_{i}$ is equal to or smaller than the so-called limiting distance R that results from chosen BAF (i.e., $\alpha $). One characteristic of angle count sampling is that a sample plot does no longer has a fixed radius in which sample trees are selected, but each tree creates its individual radius ${R}_{i}$ that is—likewise the associated inclusion probability—proportional to its basal area. For this reason, the angle count method is also referred to as variable radius plot sampling and implements the probability proportional to size (PPS) concept. The BWI3 uses a basal area factor of 4 that is respectively adjusted for sample trees at the forest boundary by a geometric intersection of the boundary transect with the individual tree’s inclusion circle [31]. A further inventory threshold for a tree to be recorded is a DBH of at least 7 cm. For each sample tree that is selected by this procedure, the DBH, the absolute tree height, the tree diameter at 7 m ($D7$) and the tree species is measured and used to estimate the volume at the tree level. These volume estimates are based on the application of tree species specific taper curves that are adjusted to the set of diameters and corresponding height measurements taken from the respective sample tree [33].

## 3. Double Sampling in the Infinite Population Approach

The estimators used in this study have been proposed by [25,30] and derive their mathematical properties under the so-called infinite population approach. Therefore, we shall first provide a short introduction into this general estimation framework. We start by assuming that the population P of trees $i\in 1,2,...,N$ within a forest of interest F is exactly defined, and each tree i has a response variable ${Y}_{i}$ (e.g., its timber volume) that can be used to define the population mean Y (e.g., the average timber volume per unit area) over F. Since a full census of all tree population individuals is almost never feasible, Y has to be estimated based on a sample. In the infinite population approach this sample is a set of points or locations x distributed independently and uniformly over the set of all possible points in F. Each point x has an associated local density $Y\left(x\right)$ (e.g., the timber volume per unit area) whose spatial distribution is given by a fixed (i.e., non stochastic) piecewise constant function. The population mean Y is mathematically equivalent to the integral of the local density function surface divided by the surface area of F, $\lambda \left(F\right)$, i.e., $Y=\frac{1}{N}{\sum}_{i=1}^{N}{Y}_{i}=\frac{1}{\lambda \left(F\right)}{\int}_{F}Y\left(x\right)dx$, and thus the population mean Y corresponds to a spatial mean. Since the actual local density function is unobserved in its entirety, one estimates Y by taking a sample ${s}_{2}$ consisting of ${n}_{2}$ points and measuring each of their respective local densities. This sampling procedure is often referred to as one-phase sampling (OPS) and ${s}_{2}$ is referred to as the terrestrial inventory. In contrast to the one-phase approach, two-phase or double-sampling procedures use information from two nested samples (phases). Practically speaking, the terrestrial inventory ${s}_{2}$ is embedded in a large phase ${s}_{1}$ comprising ${n}_{1}$ sample locations that each provide a set of explanatory variables described by the column vector $\mathit{Z}\left(x\right)={(z{\left(x\right)}_{1},z{\left(x\right)}_{2},\dots ,z{\left(x\right)}_{p})}^{\top}$ at each point $x\in {s}_{1}$. These explanatory variables are derived from auxiliary information that is available in high quantity within the forest F. For every $x\in {s}_{1}$, $\mathit{Z}\left(x\right)$ is transformed into a prediction $\widehat{Y}\left(x\right)$ of $Y\left(x\right)$ using the choice of some prediction model. The basic idea of this method is to boost the sample size by providing a large sample of less precise but cheaper predictions of $Y\left(x\right)$ in ${s}_{1}$ and to correct any possible model bias, i.e., $\mathbb{E}(Y\left(x\right)-\widehat{Y}\left(x\right))$, using the subsample of terrestrial inventory units where the value of $Y\left(x\right)$ is observed. In this context, it is also important to note that the response and auxiliary variables are assumed to be error-free and the resulting errors for the point estimates reflect only the uncertainty due to sampling.

## 4. Estimators

#### 4.1. Design-Based One-Phase Estimator for Cluster Sampling (SRS)

The one-phase estimator for cluster sampling (SRS) constitutes the status quo that is currently applied under the existing one-phase sampling design of the German NFI in order to obtain point and variance estimates for the mean timber volume of a given estimation unit. In order to provide all estimators in the infinite population framework and ensure a consistent terminology with the two-phase estimators in Section 4.2, we will introduce the SRS estimator that is applied in the BWI3 algorithms [34] in the form given in Mandallaz [12] and Mandallaz et al. [35].

In order to calculate the local density ${Y}_{c}\left(x\right)$ at the cluster level, a cluster is defined as consisting of M sample locations (in the BWI3, we have $M=4$) where $M-1$ sample locations ${x}_{2},\dots ,{x}_{M}$ are created close to the cluster origin ${x}_{1}$ by adding a fixed set of spatial vectors ${e}_{2},\dots ,{e}_{M}$ to ${x}_{1}$. The actual number of plots per cluster, $M\left(x\right)$, is a random variable due to the uniform distribution of ${x}_{l}$ ($l=1,\dots ,M$) in the forest F and to the forest/non-forest decision for each sample location ${x}_{l}$:

$$M\left(x\right)=\sum _{l=1}^{M}{I}_{F}\left({x}_{l}\right)\phantom{\rule{12.75pt}{0ex}}\mathrm{where}\phantom{\rule{12.75pt}{0ex}}{I}_{F}\left({x}_{l}\right)=\left\{\begin{array}{c}1\phantom{\rule{4.pt}{0ex}}\mathrm{if}\phantom{\rule{4.pt}{0ex}}{x}_{l}\in F\hfill \\ 0\phantom{\rule{4.pt}{0ex}}\mathrm{if}\phantom{\rule{4.pt}{0ex}}{x}_{l}\notin F\hfill \end{array}\right.$$

The local density on cluster level ${Y}_{c}\left(x\right)$, which is in our case the timber volume per hectare, is then defined as the average of the individual sample plot densities $Y\left({x}_{l}\right)$:

$${Y}_{c}\left(x\right)=\frac{{\sum}_{l=1}^{M}{I}_{F}\left({x}_{l}\right)Y\left({x}_{l}\right)}{M\left(x\right)}$$

The local density $Y\left({x}_{l}\right)$ on individual sample plot level was calculated according to the description in Mandallaz [12], which can be rewritten for angle-count sampling technique applied in the BWI3. The general form of $Y\left(x\right)$ in Mandallaz [12] is given as the Horwitz-Thompson estimator
where ${Y}_{i}$ is in our case the timber volume of the tree i recorded at sample location x in m${}^{3}$. Each tree has an inclusion probability ${\pi}_{i}$ that is well defined as the proportion of its inclusion circle area $\lambda \left({K}_{i}\right)$ within the forest area $\lambda \left(F\right)$, i.e., via their geometric intersection:

$$Y\left({x}_{l}\right)=\sum _{i\in {s}_{2}\left({x}_{l}\right)}\frac{{Y}_{i}}{{\pi}_{i}\lambda \left(F\right)}$$

$${\pi}_{i}=\frac{\lambda ({K}_{i}\cap F)}{\lambda \left(F\right)}$$

The radius ${R}_{i}$ of the tree’s inclusion circle ${K}_{i}$ is given by ${R}_{i}=DB{H}_{i}/c{f}_{i,corr}$ (also referred to as limiting distance), where $c{f}_{i,corr}$ is the original counting factor $cf$ corrected for potential boundary effects at the forest border. In case of angle-count sampling, we can rewrite ${\pi}_{i}$ as

$${\pi}_{i}=\frac{{G}_{i}}{c{f}_{i,corr}\lambda \left(F\right)}$$

Since the intersection area $\lambda ({K}_{i}\cap F)/\lambda \left(F\right)$ can be expressed using the trees basal area ${G}_{i}$ (in m${}^{2}$) and the corrected counting factor:

$$\lambda ({K}_{i}\cap F)=\frac{{G}_{i}}{c{f}_{i,corr}}\phantom{\rule{12.75pt}{0ex}}\mathrm{where}\phantom{\rule{12.75pt}{0ex}}c{f}_{i,corr}=cf\frac{\lambda \left({K}_{i}\right)}{\lambda ({K}_{i}\cap F)}$$

Equation (5) in Equation (3) yields the rewritten form of $Y\left({x}_{l}\right)$ for angle count sampling that conforms to the definition used in the BWI3 algorithms [34]:
where $nh{a}_{i}$ is the number of trees per hectare represented by tree i. The local densities on cluster level can then be used to derive the estimated spatial mean ${\widehat{\overline{Y}}}_{c}$ and its estimated variance $\widehat{\mathbb{V}}\left({\widehat{\overline{Y}}}_{c}\right)$ for any given spatial unit for which ${n}_{2}\ge 2$ (${n}_{2}$ denoting the number of clusters):
with $\overline{{M}_{2}}=\frac{{\sum}_{x\in {s}_{2}}M\left(x\right)}{{n}_{2}}$.

$$Y\left({x}_{l}\right)=\sum _{i\in {s}_{2}\left({x}_{l}\right)}\frac{c{f}_{i,corr}{Y}_{i}}{{G}_{i}}=\sum _{i\in {s}_{2}\left({x}_{l}\right)}nh{a}_{i}{Y}_{i}$$

$$\begin{array}{cc}\hfill {\widehat{\overline{Y}}}_{c}& =\frac{{\sum}_{x\in {s}_{2}}M\left(x\right){Y}_{c}\left(x\right)}{{\sum}_{x\in {s}_{2}}M\left(x\right)}\hfill \end{array}$$

$$\begin{array}{cc}\hfill \widehat{\mathbb{V}}\left({\widehat{\overline{Y}}}_{c}\right)& =\frac{1}{{n}_{2}({n}_{2}-1)}\sum _{x\in {s}_{2}}{\left(\frac{M\left(x\right)}{\overline{{M}_{2}}}\right)}^{2}{({Y}_{c}\left(x\right)-{\widehat{\overline{Y}}}_{c})}^{2}\hfill \end{array}$$

#### 4.2. Design-Based Small Area Regression Estimators for Cluster Sampling

All three considered small area estimators use ordinary least square (OLS) regression models to produce predictions of the local density ${Y}_{c}\left(x\right)$ directly on the cluster level c. We consider the internal model approach, where the estimators take into account that the regression coefficients on the cluster level were fitted using the same sample used for estimation. To apply this to small area estimation, the vector of estimated regression coefficients on the cluster level is found by “borrowing strength” from the entire terrestrial sample ${s}_{2}$ of the current inventory:

$$\begin{array}{cc}\hfill {\widehat{\mathit{\beta}}}_{c,{s}_{2}}& ={\mathit{A}}_{c,{s}_{2}}^{-1}\left(\frac{1}{{n}_{2}}\sum _{x\in {s}_{2}}M\left(x\right){Y}_{c}\left(x\right){\mathit{Z}}_{c}\left(x\right)\right)\hfill \end{array}$$

$$\begin{array}{cc}\hfill {\mathit{A}}_{c,{s}_{2}}& =\frac{1}{{n}_{2}}\sum _{x\in {s}_{2}}M\left(x\right){\mathit{Z}}_{c}\left(x\right){\mathit{Z}}_{c}^{\top}\left(x\right)\hfill \end{array}$$

${\mathit{Z}}_{c}\left(x\right)$ is the column vector of explanatory variables on the cluster level, which is calculated as the weighted average of the explanatory variables $\mathit{Z}\left({x}_{l}\right)$ on the individual plot levels ${x}_{1},\dots ,{x}_{l}$ (Equation (10)). The weight $w\left({x}_{l}\right)$ is the proportion of the extraction area (support) within the forest F used to derive the explanatory variables from the raw auxiliary information.

$${\mathit{Z}}_{c}\left(x\right)=\frac{{\sum}_{l=1}^{M}{I}_{F}\left({x}_{l}\right)w\left({x}_{l}\right)\mathit{Z}\left({x}_{l}\right)}{{\sum}_{l=1}^{M}{I}_{F}\left({x}_{l}\right)w\left({x}_{l}\right)}$$

The estimated design-based variance-covariance matrix ${\widehat{\Sigma}}_{{\widehat{\mathit{\beta}}}_{c,{s}_{2}}}$ accounts for the fact that the regression model is internal and reflects the sampling variability that occurs when estimating the regression coefficients on the realized sample ${s}_{2}$. It is defined as
with
being the empirical model residuals at the cluster level. By construction of OLS, the empirical model residuals satisfy that they average to zero, i.e., $\frac{{\sum}_{x\in {s}_{2}}M\left(x\right){\widehat{R}}_{c}\left(x\right)}{{\sum}_{x\in {s}_{2}}M\left(x\right)}=0$, if ${\mathit{Z}}_{c}\left(x\right)$ contains the intercept term 1. Because the model is fitted internally, i.e., using the terrestrial sample ${s}_{2}$ of the inventory area, this leads to the important zero mean residual property of the theoretical model residuals over the inventory domain F, i.e., ${\int}_{F}{R}_{c}\left(x\right)dx$. Mandallaz [30] and Mandallaz et al. [25] used this property particular for the case of small area estimation to derive better variance estimates based on the g-weight technique adapted from the works of Särndal et al. [9].

$${\widehat{\Sigma}}_{{\widehat{\mathit{\beta}}}_{c,{s}_{2}}}={\mathit{A}}_{c,{s}_{2}}^{-1}\left(\frac{1}{{n}_{2}^{2}}\sum _{x\in {s}_{2}}{M}^{2}\left(x\right){\widehat{R}}_{c}^{2}\left(x\right){\mathit{Z}}_{c}\left(x\right){\mathit{Z}}_{c}^{\top}\left(x\right)\right){\mathit{A}}_{c,{s}_{2}}^{-1}$$

$${\widehat{R}}_{c}={Y}_{c}\left(x\right)-{\mathit{Z}}_{c}^{\top}\left(x\right){\widehat{\mathit{\beta}}}_{c,{s}_{2}}={Y}_{c}\left(x\right)-{\widehat{Y}}_{c}\left(x\right)$$

In the following, we will give a short description of each small area estimator and refer to Mandallaz et al. [25], Mandallaz [30], Mandallaz et al. [35] if the reader requires additional mathematical details or proofs. The estimators have also been implemented in the open-source software-package forestinventory [36] in the statistical software R [37] which was used to compute all estimates in this study.

#### 4.2.1. Pseudo Small Area Estimator (PSMALL)

All point information used for small area estimation is now restricted to that available at the sample locations ${s}_{1,G}$ or ${s}_{2,G}$ in the small area G, with exception of ${\widehat{\mathit{\beta}}}_{c,{s}_{2}}$ and ${\widehat{\Sigma}}_{{\widehat{\mathit{\beta}}}_{c,{s}_{2}}}$ which are always based on the entire sample ${s}_{2}$. We thus first define the following quantities on the small area level:

$$\begin{array}{ccc}\hfill {\widehat{\overline{\mathit{Z}}}}_{c,G}& =\frac{{\sum}_{x\in {s}_{1,G}}{M}_{G}\left(x\right){\mathit{Z}}_{c,G}\left(x\right)}{{\sum}_{x\in {s}_{1,G}}{M}_{G}\left(x\right)}\phantom{\rule{12.75pt}{0ex}}\hfill & \hfill \mathrm{where}\phantom{\rule{12.75pt}{0ex}}{\mathit{Z}}_{c,G}\left(x\right)=\frac{{\sum}_{l=1}^{M}{I}_{G}\left({x}_{l}\right)\mathit{Z}\left({x}_{l}\right)}{{M}_{G}\left(x\right)}\end{array}$$

$$\begin{array}{ccc}\hfill {Y}_{c,G}\left(x\right)& =\frac{{\sum}_{l=1}^{M}{I}_{G}\left({x}_{l}\right)Y\left({x}_{l}\right)}{{M}_{G}\left(x\right)}\hfill & \hfill \mathrm{and}\phantom{\rule{12.75pt}{0ex}}{\widehat{Y}}_{c,G}\left(x\right)={\widehat{\overline{\mathit{Z}}}}_{c,G}^{\top}{\widehat{\mathit{\beta}}}_{c,{s}_{2}}\end{array}$$

$$\begin{array}{ccc}\hfill {\overline{\widehat{R}}}_{2,G}& =\frac{{\sum}_{x\in {s}_{2,G}}{M}_{G}\left(x\right){\widehat{R}}_{c,G}\left(x\right)}{{\sum}_{x\in {s}_{2,G}}{M}_{G}\left(x\right)}\phantom{\rule{12.75pt}{0ex}}\hfill & \hfill \mathrm{where}\phantom{\rule{12.75pt}{0ex}}{\widehat{R}}_{c,G}\left(x\right)={Y}_{c,G}\left(x\right)-{\widehat{Y}}_{c,G}\left(x\right)\end{array}$$

Note that the restriction to G,i.e., ${I}_{G}\left({x}_{l}\right)=\{0,1\}$, is made on the individual sample plot level ${x}_{l}$, and ${M}_{G}\left(x\right)={\sum}_{l=1}^{M}{I}_{G}\left({x}_{l}\right)$ thus is the number of sample plots per cluster within the small area. The asymptotically design-unbiased point estimate of PSMALL is then defined according to Equation (14a). The first term estimates the small area population mean of G by applying the globally derived regression coefficients to the small area cluster means of the explanatory variables ${\widehat{\overline{\mathit{Z}}}}_{c,G}$. The second term then corrects for a potential bias of the regression model predictions in the small area G by adding the mean of the empirical residuals ${\overline{\widehat{R}}}_{2,G}$ in G. This correction is necessary because the zero mean residual property that holds in F is not guaranteed to hold in small area G under this construction.
with ${\overline{M}}_{2,G}=\frac{{\sum}_{x\in {s}_{2,G}}{M}_{G}\left(x\right)}{{n}_{2,G}}$.

$$\begin{array}{cc}\hfill {\widehat{Y}}_{c,G,PSMALL}& ={\widehat{\overline{\mathit{Z}}}}_{c,G}^{\top}{\widehat{\mathit{\beta}}}_{c,{s}_{2}}+{\overline{\widehat{R}}}_{2,G}\hfill \end{array}$$

$$\begin{array}{cc}\hfill \widehat{\mathbb{V}}\left({\widehat{Y}}_{c,G,PSMALL}\right)& ={\widehat{\overline{\mathit{Z}}}}_{c,G}^{\top}{\widehat{\Sigma}}_{{\widehat{\mathit{\beta}}}_{c,{s}_{2}}}{\widehat{\overline{\mathit{Z}}}}_{c,G}+{\widehat{\mathit{\beta}}}_{c,{s}_{2}}^{\top}{\widehat{\Sigma}}_{{\widehat{\overline{\mathit{Z}}}}_{c,G}}{\widehat{\mathit{\beta}}}_{c,{s}_{2}}\hfill \\ & +\frac{1}{{n}_{2,G}({n}_{2,G}-1)}\sum _{x\in {s}_{2,G}}{\left(\frac{{M}_{G}\left(x\right)}{{\overline{M}}_{2,G}}\right)}^{2}{({\widehat{R}}_{c,G}\left(x\right)-{\overline{\widehat{R}}}_{2,G})}^{2}\hfill \end{array}$$

The variance-covariance matrix of the auxiliary vector ${\widehat{\Sigma}}_{{\widehat{\overline{\mathit{Z}}}}_{c,G}}$ is thereby defined as
with ${\overline{M}}_{1,G}=\frac{{\sum}_{x\in {s}_{1,G}}{M}_{G}\left(x\right)}{{n}_{1,G}}$.

$${\widehat{\Sigma}}_{{\widehat{\overline{\mathit{Z}}}}_{c,G}}=\frac{1}{{n}_{1,G}({n}_{1,G}-1)}\sum _{x\in {s}_{1,G}}{\left(\frac{{M}_{G}\left(x\right)}{{\overline{M}}_{1,G}}\right)}^{2}({\mathit{Z}}_{c,G}\left(x\right)-{\widehat{\overline{\mathit{Z}}}}_{c,G}){({\mathit{Z}}_{c,G}\left(x\right)-{\widehat{\overline{\mathit{Z}}}}_{c,G})}^{\top}$$

The estimated design-based variance of ${\widehat{Y}}_{c,G,PSMALL}$ is given by Equation (14b). Basically, the first term constitutes the variance introduced by the uncertainty in the regression coefficients, whereas the second term expresses the variance caused by estimating the exact auxiliary mean in G using a non-exhaustive sample ${s}_{1,G}$. The third term is the variance of the model residuals and thus accounts for the inaccuracies of the model predictions. Note that the first term can also be rewritten using g-weights [35] (p. 14) which ensures some beneficial calibration of the auxiliary variables to the first-phase sample.

#### 4.2.2. Pseudo Synthetic Estimator (PSYNTH)

The PSYNTH estimator is commonly applied when no terrestrial sample is available within the small area G (i.e., ${n}_{2,G}=0$). The point estimate (Equation (16a)) is thus only based on the predictions generated by applying the globally derived regression coefficients to the small area cluster means of the explanatory variables ${\widehat{\overline{\mathit{Z}}}}_{c,G}$. Note that the bias correction term using the empirical residuals (Equation (14a)) can no longer be applied. Conditioned on the realized sample, the PSYNTH estimator can thus potentially have an unobservable design-based bias.

$$\begin{array}{cc}\hfill {\widehat{Y}}_{c,G,PSYNTH}& ={\widehat{\overline{\mathit{Z}}}}_{c,G}^{\top}{\widehat{\mathit{\beta}}}_{c,{s}_{2}}\hfill \end{array}$$

$$\begin{array}{cc}\hfill \widehat{\mathbb{V}}\left({\widehat{Y}}_{c,G,PSYNTH}\right)& ={\widehat{\overline{\mathit{Z}}}}_{c,G}^{\top}{\widehat{\Sigma}}_{{\widehat{\mathit{\beta}}}_{c,{s}_{2}}}{\widehat{\overline{\mathit{Z}}}}_{c,G}+{\widehat{\mathit{\beta}}}_{c,{s}_{2}}^{\top}{\widehat{\Sigma}}_{{\widehat{\overline{\mathit{Z}}}}_{c,G}}{\widehat{\mathit{\beta}}}_{c,{s}_{2}}\hfill \end{array}$$

The contribution to the variance by the model residuals in small area G can also no longer be considered (Equation (16b)). As a result, the synthetic estimator will usually have a smaller variance than estimators that consider the model residuals, but at the cost of a potential bias. Note that the PSYNTH estimations are still design-based, but one purely has to rely on the validity of the regression model within the small area as it is the case in the model-dependent framework.

#### 4.2.3. Extended Pseudo Synthetic Estimator (EXTPSYNTH)

The EXTPSYNTH estimator (Equation (17a,b)) has been proposed by Mandallaz [30] as a transformed version of the PSMALL estimator that has the form of the PSYNTH estimator but remains asymptotically design unbiased. It has the advantage that the mean of the empirical model residuals of the OLS regression model for the entire area F and the small area G are by construction both zero at the same time, i.e., ${\overline{\widehat{R}}}_{c}={\overline{\widehat{R}}}_{c,G}=0$. This is realized by extending the auxiliary vector ${\mathit{Z}}_{c}\left(x\right)$ by the indicator variable ${I}_{c,G}$ which takes the value 1 if the entire cluster lies within the small area G and 0 if the entire cluster is outside G, i.e., ${I}_{c,G}\left(x\right)=\frac{{M}_{G}\left(x\right)}{M\left(x\right)}$. The extended auxiliary vector thus becomes ${\mathbb{Z}}_{c}^{\top}\left(x\right)=({\mathit{Z}}_{c}^{\top}\left(x\right),{I}_{c,G}\left(x\right))$ and the new regression coefficient using ${\mathbb{Z}}_{c}\left(x\right)$ instead of ${\mathit{Z}}_{c}\left(x\right)$ in Equation (9a,b) is denoted as ${\widehat{\mathit{\theta}}}_{{s}_{2}}$. All remaining components are calculated by plugging in ${\mathbb{Z}}_{c}\left(x\right)$ in Equation (13a–c). A decomposition of ${\widehat{\mathit{\theta}}}_{{s}_{2}}$ reveals that the residual correction term is now included in the regression coefficient ${\widehat{\mathit{\theta}}}_{{s}_{2}}$ [35].

$$\begin{array}{cc}\hfill {\widehat{Y}}_{c,G,EXTPSYNTH}& ={\widehat{\overline{\mathbb{Z}}}}_{c,G}^{\top}{\widehat{\mathit{\theta}}}_{c,{s}_{2}}\hfill \end{array}$$

$$\begin{array}{cc}\hfill \widehat{\mathbb{V}}\left({\widehat{Y}}_{c,G,EXTPSYNTH}\right)& ={\widehat{\overline{\mathbb{Z}}}}_{c,G}^{\top}{\widehat{\Sigma}}_{{\widehat{\mathit{\theta}}}_{c,{s}_{2}}}{\widehat{\overline{\mathbb{Z}}}}_{c,G}+{\widehat{\mathit{\theta}}}_{c,{s}_{2}}^{\top}{\widehat{\Sigma}}_{{\widehat{\overline{\mathbb{Z}}}}_{c,G}}{\widehat{\mathit{\theta}}}_{c,{s}_{2}}\hfill \end{array}$$

However, it is important to note that ${\overline{\widehat{R}}}_{c,G}=0$ under the extended regression model only holds if the sample plots ${x}_{1},\dots ,{x}_{l}$ of a cluster are all either inside our outside the small area, i.e., ${M}_{G}\left(x\right)\equiv M\left(x\right)$, and thus ${I}_{c,G}\left(x\right)=\frac{{M}_{G}\left(x\right)}{M\left(x\right)}$ can only take the values 1 or 0. Mandallaz et al. [35] assumed that the effects on the estimates should be negligible as the number of occasions where ${M}_{G}\left(x\right)<M\left(x\right)$ was considered to be small in practical implementations. It was thus a further objective of this study to investigate the actual number of occurrences as well as effects of this phenomenon by comparing the estimates of EXTPSYNTH to those of PSMALL.

#### 4.3. Measures of Estimation Accuracy

The estimation precision was quantified by the estimation error, which is the ratio of the standard error and the point estimate (here $\widehat{Y}$ stands for the point estimate produced under the various estimators):

$$erro{r}_{[\%]}=\frac{\sqrt{\widehat{\mathbb{V}}\left(\widehat{Y}\right)}}{\widehat{Y}}\times 100$$

We further calculated the 95% confidence interval for each estimate. The confidence intervals were used heuristically for hypothesis testing to determine whether the point estimates of the three estimators for a given small area were statistically different. The confidence intervals for the SRS estimator can be obtained as:

$$C{I}_{1-\alpha}\left({\widehat{\overline{Y}}}_{c}\right)={\widehat{\overline{Y}}}_{c}\pm {t}_{{n}_{2}-1,1-\frac{\alpha}{2}}\sqrt{\widehat{\mathbb{V}}\left({\widehat{\overline{Y}}}_{c}\right)}$$

The confidence intervals for the PSMALL and EXTPSYNTH estimates are calculated as:

$$\begin{array}{cc}\hfill C{I}_{1-\alpha}\left({\widehat{Y}}_{c,G,EXTPSYNTH}\right)& ={\widehat{Y}}_{c,G,EXTPSYNTH}\pm {t}_{{n}_{2,G}-1,1-\frac{\alpha}{2}}\sqrt{\widehat{\mathbb{V}}\left({\widehat{Y}}_{c,G,EXTPSYNTH}\right)}\hfill \end{array}$$

$$\begin{array}{cc}\hfill C{I}_{1-\alpha}\left({\widehat{Y}}_{c,G,PSMALL}\right)& ={\widehat{Y}}_{c,G,PSMALL}\pm {t}_{{n}_{2,G}-1,1-\frac{\alpha}{2}}\sqrt{\widehat{\mathbb{V}}\left({\widehat{Y}}_{c,G,PSMALL}\right)}\hfill \end{array}$$

For the PSYNTH estimates, the confidence intervals are
with p being the number of parameters used in the regression model including the intercept term.

$$C{I}_{1-\alpha}\left({\widehat{Y}}_{c,G,PSYNTH}\right)={\widehat{Y}}_{c,G,PSYNTH}\pm {t}_{{n}_{2}-p,1-\frac{\alpha}{2}}\sqrt{\widehat{\mathbb{V}}\left({\widehat{Y}}_{c,G,PSYNTH}\right)}$$

In order to address the potential benefits of the small area estimators compared with the SRS approach, we calculated the relative efficiency (RE, Equation (22)) which can be interpreted as the relative sample size under SRS needed to achieve the variance under the double-sampling (DS) estimators.
where $\widehat{Y}$ stands for the point estimate produced under the respective estimator.

$$RE=\frac{\widehat{\mathbb{V}}\left({\widehat{Y}}_{SRS}\right)}{\widehat{\mathbb{V}}\left({\widehat{Y}}_{DS}\right)}$$

## 5. Case Study

#### 5.1. Study Area and Small Area Units

The German federal state Rhineland-Palatinate (RLP) is located in the western part of Germany and borders Luxembourg, France and Belgium. With 42.3% (appr. 8400 km${}^{2}$) of the entire state area (19,850 km${}^{2}$) covered by forest, RLP is one of the two states with the highest forest coverage among all federal states of Germany [2]. The forests of RLP are further characterised by a pronounced diversity in bioclimatic growing conditions that have strong influence on the local growth dynamics as well as tree species composition [38] and are further characterised by large variety of forest structures ranging from characteristic oak coppices (Moselle valley), pure spruce, beech and scots pine forests (i.a. Hunsrück and Palatinate forest) up to mixed forests comprising variable proportions of oak, larch, spruce, Scots pine and beech. Around 82% of the forest area in RLP are mixed forest stands and 69% of the forest area exhibit a multi-layered vertical structure. The forest area of RLP are divided into 3 ownership classes, i.e., state forest (27%), municipal forest (46%) and privately owned forest (27%). The forest service of RLP has the legal mandate to sustainably manage the state and municipal forest area (73% of the entire forest area), including forest planning, harvesting and the sale of wood [39]. For this reason, the entire forest area has been spatially organised in 3 main hierarchical management units (Figure 1). On the upper level, RLP has been divided into 45 forest districts (Forstämter, FA), which are further divided into a total number of 405 sub-districts (Forstreviere, FR). The next level are the forest stands (104,184 in total) for which expert judgements are conducted by SFIs in a 5 to 10 year period in order to set up management strategies for the upcoming 10 years. The FAs and FRs constituted the SA units for which design-based small area estimations of the mean standing timber volume were calculated by incorporating the available terrestrial inventory data of the BWI3 in the estimators described in Section 4. The average area of the SA units was 43,777 ha on the FA-level, and 4624 ha on the FR level.

#### 5.2. Terrestrial Sample

Rhineland-Palatinate is covered by a 2 × 2 km inventory grid of the German NFI. In the last inventory (BWI3) conducted in the year 2011 and 2012, timber volume information was derived for 2810 clusters (8092 plots) in the field survey. The local timber volume density on the plot and cluster level for this sample was consequently calculated according to Section 4.1. In the framework of this survey, the plot center coordinates were re-measured with the differential global satellite navigation system (DGPS) technique. Knowledge about the exact plot positions were considered crucial to provide optimal comparability between the terrestrial observations and the information derived from the auxiliary information. A comparison of the DGPS coordinates with the so-far used target coordinates revealed that 90% of all horizontal deviations lay in the range of 25 m. A detailed analysis of horizontal DGPS errors in RLP by Lamprecht et al. [40] indicated that 80% of the plots should not exceed horizontal DGPS errors of 8 m. For 162 plots, the DGPS coordinates were replaced by their target coordinates due to missingness or implausible values. The terrestrial sample size ${n}_{2,G}$ within the FA units was 46 clusters on average and ranged between 11 and 64. Within the FR units, ${n}_{2,G}$ was considerably smaller with an average of 5 clusters and a range between 0 and 13.

#### 5.3. Extension to Double-Sampling Design

In order to apply the small area estimators (Section 4.2), the existing NFI design was extended to a double-sampling cluster design by densifying the existing systematic 2 × 2 km grid to a grid size of 500 × 500 m that constituted the large first phase ${s}_{1}$ (Figure 1, right). The existing terrestrial phase ${s}_{2}$ was integrated by replacing the target coordinates of the respective ${s}_{1}$ clusters by the terrestrially measured DGPS coordinates. The sampling frame was further restricted to the municipal and state forest area. The forest/non-forest decision for each plot was thereby made by a spatial intersection of the plot center coordinates with a polygon layer of the municipal and state forest stand layer provided by the forest service. Using this stand layer provided the advantage to consistently apply the same forest/non-forest definition to the entire sample ${s}_{1}$ in order to decide about excluding or including a plot in the sampling frame. The terrestrial sample size ${n}_{2}$ was thus reduced to 2055 clusters (5791 plots). Table 1 provides a short descriptive summary about the volume densities and the main attributes of the NFI plots located in the state and municipal forest sampling frame. The densification led to an average sample size ${n}_{1,G}$ of 759 clusters (range: 246–1022) in the FA units, and 88 clusters (range: 1–194) in the FR units.

#### 5.4. Auxiliary Data

#### 5.4.1. LiDAR Canopy Height Model

A prerequisite for the application of the suggested two-phase small area estimators is the identification of suitable auxiliary data available over the entire study area. From 2003 to 2013, the topographic survey institution of RLP conducted an airborne laserscanning acquisition over the entire federal state during leaf-off conditions in order to derive a countrywide digital terrain model (DTM) as well as a digital surface model (DSM). For this study, the recorded ALS data was used to create a canopy height model (CHM) in raster format, providing discrete information about the canopy surface height of the forest area in a spatial resolution of 5 m (Figure 2, top). The CHM was calculated as the difference between the digital terrain model and the digital surface model that were derived by a Delauney interpolation of the ground and first ALS pulses respectively. A more detailed description of the procedure can be found in Hill et al. [41]. The CHM provided the most valuable information to be used in the OLS regression model for predicting the timber volume on the plot and cluster level. However, it should be noted that the prolonged acquisition period of the ALS campaign led to the possibility of poor temporal alignment with the BWI3 survey, sometimes up to 10 years. In addition, the quality of the CHM varied substantially as ALS technology evolved over the years. For example, the ALS acquisitions recorded in 2002 and 2003 exhibited particularly poor quality with about only 0.04 points per m${}^{2}$, whereas more recent datasets contained more than 5 points per m${}^{2}$. Furthermore, CHM information was not available at 16 sample locations due to sensor failures. These plots were deleted from the sampling frame and treated as missing at random. This assumption was considered to be reasonable as the respective sample locations did not systematically exclude specific forest structures.

#### 5.4.2. Tree Species Classification Map

Additional auxiliary data was derived from a countrywide satellite-based classification map predicting the five main tree species [42], i.e., European beech, Sessile and Pedunculate oak, Norway spruce, Douglas fir and Scots pine (Figure 2, bottom). The tree species map has a grid size of 5 × 5 m and was calculated from 22 bi-temporal satellite images (SPOT5 and RapidEye) using a spatially adaptive classification algorithm [43]. As timber volume estimation on the tree level is often based on species-specific biomass and volume equations, the use of tree species information has often been stated as a key factor for improving the precision of timber volume estimates [44]. In this respect, incorporating the tree species map was particularly attractive as it predicts five of the seven tree species that are used in the BWI3 taper functions [33] to calculate the timber volume of a sample tree. However, due to unavailable satellite data, the tree species map excluded one large patch with an area of 415 km${}^{2}$ in the south-west part of RLP covering an entire FA unit consisting of 10 FR units. In 9 additional FR units, the tree species information was also missing for a subset of the sample locations due to two additional patches with areas of 76 km${}^{2}$ and 100 km${}^{2}$ respectively in the northern part of RLP. For these 19 FR units, small area estimation was thus restricted to using only the available CHM information in the regression model. Thus, 411 of 5791 sample locations (approximately 7%) used to fit the regression model were affected by missing tree species information. A summary of the sample sizes and missing auxiliary data for both the CHM and the tree species map is provided in Table 2.

#### 5.5. Calculation of the Explanatory Variables

#### 5.5.1. Canopy Height Model

The continuous explanatory variables derived from the CHM were the mean canopy height (meanheight) and the standard deviation (stddev). The quantities were calculated by evaluating the raster values around each sample location within a circle with a predefined radius of 12 m, i.e., the support. In order to correct for edge effects at the forest border, the intersection of each support area to the state and municipal forest area was determined using a polygon mask provided by the state forest service. The percentage of the support within the forest layer was used as the weight $w\left({x}_{l}\right)$ introduced in Equation (10) in order to derive the weighted mean of the explanatory variables on the cluster level. Neglecting the support adjustment would deteriorate the coherence between explanatory variables computed at the forest boundary and the corresponding local density that already includes a potential boundary adjustment, thus introducing unnecessary noise to the model. The boundary adjustment to the support also makes the sampling frame more consistent for the different data sources (Section 5.3).

The ALS acquisition year (ALSyear) was added as a categorical variable in order to account for the time lag with the terrestrial survey as well as to help explain the heterogeneity in the data introduced by the varying ALS quality. In 2008, a sensor error produced particularly poor ALS quality so the year was divided accordingly into two factor levels, denoted 2008_1 and 2008. Furthermore, in order to increase the number of observations per factor level the years 2006 and 2007 were pooled together and the same was done for 2012 and 2013. The result was nine factor levels denoted as 2002, 2003, 2007, 2008_1, 2008, 2009, 2010, 2011 and 2012.

#### 5.5.2. Tree Species Classification Map

The tree species map was used to predict the main tree species at each sample plot which served as an additional categorical variable treespecies in the regression model. In the first step, one of the five tree species was assigned to a sample location if 100% of the raster values within the edge-corrected support were classified as that species. Otherwise, the sample location was assigned the value ’mixed’. Likewise for the CHM variables, the support radius was 12 m although the use of different support sizes for each explanatory variable would be in agreement with the two-phase estimators presented in Section 4.2. The specific setting for the support size and the percentage threshold was found to be optimal in order to yield the best possible regression model precision when incorporating the treespecies variable as an additional predictor. In a second step, the treespecies variable was also passed through a calibration model in order to reduce the effects of misclassification errors on the regression model coefficients and to increase model accuracy. The calibration model consisted of a decision tree from a random forest algorithm [45] that was trained to predict the actual main plot tree species (known for all terrestrial plots) based on available auxiliary variables. These variables were the predicted treespecies variable, the mean canopy height and standard deviation of the CHM, as well as the proportion of coniferous trees estimated from the classification map and the growing region derived from a polygon map. The algorithm was grown with 2000 trees considering 3 of the predictors for each split. We thus applied this calibration model to the treespecies variable derived at all sample locations ${s}_{1}$. Table 3 gives the classification accuracies [46] of the treespecies variable after calibration. More details on the processing of the explanatory variables and identification of optimal parameter settings for their calculation are described in Hill et al. [41].

#### 5.6. Regression Model

The model selection process for this study required a substantial time commitment due to sophisticated challenges such as: (a) the heterogeneity of the remote sensing data; (b) the identification of the optimal support sizes under angle count sampling and (c) the incorporation of tree species information. Here, only a summary of the extensive analysis that was performed is provided but the reader can refer to Hill et al. [41] if more details are desired.

The model with highest adjusted ${R}^{2}$ and lowest RMSE was achieved using meanheight, meanheight${}^{2}$, stddev, ALSyear and treespecies as main effects, and including interaction terms between meanheight and ALSyear, stddev and ALSyear, meanheight and stddev, and meanheight and treespecies. Summary information about the adjusted ${R}^{2}$, RMSE and RMSE% of the selected models is provided in Table 4. As the two-phase estimators described in Section 4.2 derive and apply the regression coefficients and the residuals on the aggregated cluster level, we re-evaluated the model as used in the estimators on the cluster level (formulas given in Appendix) and found improved model fits compared to the plot level (adjusted ${R}^{2}$ of 0.59 and RMSE of 101.61 m${}^{3}$/ha and 33.6%). Using the ALS acquisition year as a categorical variable substantially improved the model fit, indicating that it is an effective means in accounting for the noise in the data caused by ALS quality variations and time-gaps between the ALS and the terrestrial survey. However, this also led to a highly unbalanced data set when introducing the treespecies variable as an additional categorical predictor. For this reason, a individual species modeling within each ALSyear stratum remained infeasible, but might have further improved the model fit. An additional evaluation of the model’s performance within each ALS acquisition year stratum revealed that the quality of the model fit substantially varied between the strata (Table 5). In particular, values above the overall adjusted ${R}^{2}$ were higher in ALS acquisition years close to the terrestrial survey date compared to years with larger time gaps.

As described in Section 5.4.2, the information of the tree species classification map was missing within 1 FA and 19 FR units. For these small area units, we applied the regression model without the treespecies variable (Table 4, reduced model). However, the adjusted ${R}^{2}$s of the full and reduced model were found to be very similar on both the plot and cluster level. This implied that the variance reduction of the reduced model when applied to the two-phase estimators would likely be comparable to that of the full model. For this reason, a joint evaluation of the estimation results is performed in Section 6.

Concerning the treatment of outliers or leverage points, it can be advisable to remove such observations from the training data set that is used to determine the regression coefficients (Section 4.2) in order to minimize the residual variance for the entire terrestrial sample. However, it should be noted that such removal of observations does restrict the calculation of design-unbiased estimation to the PSMALL estimator, because the residuals still have to be derived for the full terrestrial sample in order to ensure unbiased estimation. This would no longer be satisfied when using the EXTPSYNTH estimator, where the residual correction term is included in the regression coefficient (Section 4.2.3). For our particular case, we conducted an analysis of influential observations [47] (pp. 160–167) on the plot level for the full regression model. We calculated the leverage values and found that 10% of all observations exceeding a predefined critical threshold, i.e., twice the average of the hat matrix diagonal entries. Further investigation revealed that several leverage points showed unusually large meanheight values compared to their respective timber volume densities. They tended to occur in ALS acquisition years with longer time gaps to the terrestrial survey date and were thus more likely caused by harvesting activities in the sample plot area. Although these areas likely affected by harvest should clearly not be removed from the sampling frame, it does provide more justification for the inclusion of the ALSyear variable to mitigate the implied effects.

## 6. Results

#### 6.1. General Estimation Results

An application of the SRS, PSMALL and EXTPSYNTH estimator was not feasible for 17 of all 405 FR-units due to an insufficient terrestrial sample size of ${n}_{2,G}<2$. We further restricted the calculation of the PSMALL and EXTPSYNTH estimator to small area units with a minimum terrestrial sample size of ${n}_{2,G}\ge 4$ to avoid unstable estimates. This affected 65 additional FR units and limited unbiased two-phase estimations to 321 (79%) of the 405 FR units. Also the PSYNTH estimator could not be applied for 2 FR-units since ${n}_{1,G}<2$. For better comparison, the descriptive summaries of point estimates and estimation errors for each estimator presented in Table 6 are always based on the same population of small area units, i.e., the 321 districts for which unbiased two-phase estimations were possible.

All estimators could however be applied to all 45 FA units due to substantially larger sample sizes. The average value and the range of the mean timber volume estimates over the evaluated FA and FR units turned out to be very similar between all estimators (Table 6). An additional pairwise comparison of the 95% confidence intervals revealed that the four estimators did in fact not produce statistically different point estimates for all FA and FR units. This confirmed that the differences between the estimators are solely found in the precision which they provide for the point estimates.

#### 6.2. Estimation Errors

On both small area levels, the design-unbiased estimators PSMALL and EXTPSYNTH led to a substantial reduction in the estimation error compared to the SRS estimator (Figure 3). On the FA level, the SRS estimator yielded an estimation error of 6.7% on average compared to 5.2% and 4.8% under EXTPSYNTH and PSMALL respectively (Table 6). The cumulative error distribution (Figure 3, left) reveals that under the SRS estimator, errors less than 5% were achieved for 17% of the FA units (8 of 45). This proportion could be increased to 62% (28 FA units) and 73% (33 FA units) by application of the PSMALL and EXTPSYNTH estimator. 95% of all estimates exhibited errors less than 9.5% under the SRS estimator and less than 6.6% when using PSMALL or EXTPSYNTH. Estimation errors higher than 10% only appeared twice for each of the three estimators.

Although the estimation errors were substantially larger overall on the FR level compared to the FA level due to smaller sample sizes, the error reduction from SRS by PSMALL and EXTPSYNTH were even more pronounced (Figure 3, right). The average error under the SRS estimator was 16.9%, while it was 11.3% and 12.2% under PSMALL and EXTPSYNTH (Table 6). Errors smaller than 10% were achieved for 15% of the FR units by the SRS estimator, and for 46% by the PSMALL and EXTPSYNTH estimator. 95% of the 321 FR units where PSMALL and EXTPSYNTH could be applied exhibited errors less than 20%. In comparion, the SRS estimates resulted in errors less than 32.3% for 95% of the 321 evaluated FR units.

On both small area levels, the PSYNTH estimator resulted in much smaller estimation errors compared to PSMALL and EXTPSYNTH. This was as expected, since the PSYNTH variance estimate does not take the residual variation in each small area unit into account (Section 4.2.2). Compared to the asymptotically design-unbiased estimators PSMALL and EXTPSYNTH, the estimation errors produced by PSYNTH thus seem to be too optimistic. One should also recall that the estimates of the PSYNTH estimator are potentially design-biased.

#### 6.3. Comparison of PSMALL and EXTPSYNTH

Figure 3 reveals that the error distribution of PSMALL and EXTPSYNTH are very similar, with PSMALL showing marginally higher estimation errors. In order to investigate the differences between PSMALL and EXTPSYNTH, we compared the g-weight variances of both estimators for all 321 FR units (Figure 4, left). As obvious, PSMALL yielded slightly larger variances for the vast majority of the estimates. As addressed in Section 4.2.3, one possible explanation for differences was the effect of one or more clusters not entirely being included in a small area unit, as this would constitute an assumption violation of the EXTPSYNTH estimator. This violation was actually observed in 155 of the 321 FR units (48%). We compared the variances of PSMALL and EXTPSYNTH for all small areas that did not have the violations using a Wilcoxon Signed-Rank Test [48] on a 5% significance level. This test was also performed pairwise for groups ${n}_{2,G}\le 6$, ${n}_{2,G}>6$ and ${n}_{2,G}>10$. The distribution of variances from EXTPSYNTH was found to be highly significantly lower than that of PSMALL except for the group of ${n}_{2,G}>10$. The latter was expected since the variances of both estimators are asymptotically equivalent under large terrestrial sample sizes ${n}_{2,G}$ within the small area [35] (pp. 17–18). This was also confirmed by a visual comparison of the absolute differences in the variances (Figure 4, right) which decreased with increasing terrestrial sample size. Performing the same comparison for small areas with violations also revealed the EXTPSYNTH variances to be significantly smaller than the respective PSMALL variances until sample sizes ${n}_{2,G}>10$. Based on these investigations, it was not possible to determine whether the differences for sample sizes smaller than 10 were caused by the violations or just reflect the general tendency of EXTPSYNTH to produce smaller variances than PSMALL under small sample sizes. However, a visual inspection provided some evidence that the violations created a statistically significant influence on the EXTPSYNTH variance (Figure 4, left, red diamonds) that makes it appear to be slightly over-optimistic. For sample sizes of ${n}_{2,G}<6$, a weakly significant difference between the EXTPSYNTH variances of those small areas with violations and the EXTPSYNTH variances without violation was also indicated by an unpaired Wilcoxon Rank-Sum Test. However, the differences were still marginal and a comparison of the confidence intervals of PSMALL and EXTPSYNTH revealed that the variance differences did not lead to statistically significant point estimates.

#### 6.4. Variance Reduction Compared to SRS

The variance reduction relative to SRS for PSMALL and EXTPSYNTH are described in Figure 5 and Table 7. A direct comparison of the variances within the small area units revealed that the application of the design-unbiased estimators (PSMALL, EXTPSYNTH) led to a variance reduction compared to SRS in all FA units. In 75% of the FA units, the EXTPSYNTH estimator was able to reduce the variance by up to 54.1%. The reduction in variance can also be expressed in the relative efficiency values, which were 2.02 on average and ranged between 1.18 and 4.13 on the FA level. On FR level, the reduction in variance even reached values of 90% and relative efficiencies of 30 (Table 7 and Figure 5). The PSMALL estimator again yielded slightly lower variance reductions and relative efficiencies due to the generally smaller variances of the EXTPSYNTH estimator (Section 6.3).

Cases also occurred on the FR level where one or both two-phase estimators produced larger variance values than under the SRS estimator. This happened in 19% of the FR units under the EXTPSYNTH, and in 24% of the FR units under the PSMALL estimator. One possible reason for this was supposed to be a large residual variance due to a poor performance of the regression model within the small area unit. In order to investigate this hypothesis, we analyzed the three variance terms of the PSMALL estimator (Equation (14b)), i.e., the variance introduced by the uncertainty of the regression coefficients (term 1), the variance caused by estimating the auxiliary means (term 2), and the variance of the model residuals (term 3). In general, the residual term is expected to make the largest contribution to the overall variance since it’s sample size is based on ${n}_{2,G}$ whereas the auxiliary term and the coefficient term are based on larger sample sizes, i.e., ${n}_{1,G}$ and ${n}_{2}$ respectively. Figure 6 illustrates the share of the overall variance by the residual term of the PSMALL estimator scaled by the overall percentage reduction or increase of the variance compared to SRS for various small area sample sizes ${n}_{2,G}$. Not surprisingly, the residual term generally constitutes the dominating part of the PSMALL variance (around 84% on average). It has to be noted that such high residual term dominance does not necessary indicate that the PSMALL variance will be disproportionately large (Figure 6, right). However, the vast majority of cases where the PSMALL variance was considerably larger than the SRS variance occurred where the residual term contributed over 75% to the overall PSMALL variance (Figure 6, left). Among those cases, the most pronounced were observed under small sample sizes ${n}_{2,G}<5$. Here, the average increase in variance compared to SRS of those FR units with ${n}_{2,G}=4$ was 272%, compared to 62% for FR units with ${n}_{2,G}>4$. In contrast, the decreases in variance compared to SRS (Figure 6, right) were much more homogeneous in magnitude and also independent of the terrestrial sample size. Since ${n}_{2,G}$ is the same for PSMALL and SRS, these observations imply that in the problematic small areas, the sum of square residuals for the regression model are likely larger than the sum of square local densities for the clusters in ${s}_{2,G}$. This indicates the presence of outliers with large residuals, which likely arise when there was forest loss after the ALS scanning but before the terrestrial survey year.

## 7. Discussion

#### 7.1. Performance of Estimators

With the objective of extending the use of the German NFI data to additional estimation on small-scale management levels, we evaluated the performance of design-based small area regression estimators with respect to their suitability for future operational large scale application. For this reason, we conducted a case study in the German federal state of Rhineland-Palatinate where we applied the SRS, the PSMALL and the EXTPSYNTH estimators to produce estimates of the mean timber volume on two forest management levels over the entire federal state area, comprising 45 and 405 small area units respectively. In order to assess and compare the performance of the estimators, it was of particular interest to gather information about the magnitudes of estimation precision they can provide.

Our study showed that on both small area levels, the PSMALL and the EXTPSYNTH estimators generally led to a substantial reduction in estimation error compared to the standard one-phase SRS estimator. On the upper management level (FA districts), PSMALL and EXTPSYNTH produced estimation errors smaller than 5% for 73% of the small areas compared to only 17% under the one-phase SRS estimator. It could be argued that the majority of the FA units comprised a sample size (46 clusters on average) that would also have allowed to build an individual prediction model for each FA unit and apply direct estimation, which can be more efficient. The main reason why this was not considered in the present study was the large number of parameters that would have to be fitted in the individual prediction models resulting from the pronounced heterogeneity in the auxiliary data (multiple tree species and ALS acquisition years in each FA unit). In this case, the strategy to ‘borrow strength’ from the entire inventory domain was preferred to direct estimation in order to avoid overfitting and the implied risk of unstable global estimation. Even in case the above mentioned problem of overfitting is not raised, individual model building for a large number of small area units can be time-consuming and small area estimation can be a cost-saving alternative- however, possibly the cost of more efficient global estimation.

The level of precision of the FA-level could not be achieved on the lower management level (FR districts) primarily due to substantially smaller terrestrial sample sizes. However, in 95% of the FR units, the estimation errors could be limited to 20% compared to 40% under SRS. A pairwise comparison of the confidence intervals revealed that the estimators did not produce significantly different point estimates. The much smaller estimation errors of the PSYNTH estimator reflected the fact that it does not try to correct for potential bias in the point estimate which can lead to overly optimistic estimation errors and confidence intervals. One should thus prefer the unbiased estimates of PSMALL or EXTPSYNTH whenever their calculation is possible.

For several FR units, it was observed that the PSMALL and the EXTPSYNTH estimator can occasionally produce larger variances than the SRS estimator. It is important to note that this is in perfect agreement with the theory of both two-phase estimators and can theoretically appear if the residual variance in the small area, which generally constitutes the dominating part of the two-phase variance, turns out to be much higher than the variance of the terrestrial data in the small area. The empirical findings of our study suggest that such cases can particularly occur if moderate or poor model fits within a small area are combined with small terrestrial sample sizes (≤5) in the small area. A closer look on these small areas thus might reveal the reason for the poor prediction performance and help to improve the model fit. Nonetheless, it should be kept in mind that small terrestrial sample sizes can also cause the SRS estimator to not reflect the actual variation of the local density within a small area. In this case, the two-phase variance estimate might be larger but more realistic. Whereas a visual analysis of aerial images, remote sensing data or stand maps might give some further evidence for or against this hypothesis, a definite proof is practically infeasible.

We were also able to empirically confirm that the EXTPSYNTH estimator generally produces slightly smaller variances and estimation errors than the PSMALL estimator. This is most probably caused by marginally smaller model residuals due to the intercept adjustment to the terrestrial data in the small area unit, which is primarily a means to ensure the zero mean residual property of the EXTPSYNTH estimator. However, our analysis indicated that the difference between the two estimators is negligible for sample sizes ≥10 due to their asymptotic equivalency. We further investigated a potential impact on the EXTPSYNTH variance caused by the assumption violation that one or more clusters are not entirely included in the small area unit and found a slight but statistically significant tendency to be over-optimistic for sample sizes smaller than 6. More empirical evidence must be gathered before generalizing this as a rule of thumb for the application of the EXTPSYNTH under cluster sampling. It thus seems recommendable to prefer the EXTPSYNTH to the PSMALL estimator if its assumptions are not violated since it yields slightly smaller variances under mathematically soundness. Even if the differences between both estimators were marginal and did not lead to significantly different point estimates, PSMALL can serve as a safe alternative if the EXTPSYNTH assumption is violated. Aside from this, calculating both PSMALL and EXTPSYNTH and subsequently compare their results is always recommended to reveal suspicious deviations.

A commonly raised critic on the proposed design-unbiased estimators are their asymptotic properties, i.e., the validity of the confidence coverage rates is only ensured if the sample size in a small area is sufficiently large. Giving a generally valid sample size for the asymptotic validity range is, unfortunately, infeasible due to the dependency on the heterogeneity of the underlying population which is per se unknown. However, simulation studies for simple random sampling presented in Mandallaz et al. [25] suggested that a minimum sample size of 6 within a small area is sufficient to ensure the nominal coverage rates of the confidence intervals for PSMALL and EXTPSYNTH. Re-evaluating the same simulation example recently confirmed the same results for cluster sampling.

#### 7.2. Auxiliary Data

The auxiliary data used in our study were derived from two remote sensing sources, i.e., an ALS canopy height model and a tree species classification map. Likewise in many similar studies, the ALS mean canopy height proved to be the explanatory variable with highest predictive power. However, the large time-gaps of up to 10 years between the ALS acquisition and the terrestrial survey date caused the substantial introduction of artificial noise in the data. Whereas a post-stratification to the ALS acquisition years was an effective means to counteract the implied residual inflation, several leverage points were unambiguously caused by the temporal asynchronicity. Undetectable forest loss during the gap between the ALS acquisition and the NFI was also likely a cause for high residual variance in some small area units compared to the terrestrial data variance, which subsequently led to higher variances than the SRS estimator. As opposed to the ALS data, the availability of a country-wide tree species classification map has yet been unique among all German federal states. Whereas the study of Hill et al. [41] already showed that the tree species information was able to improve the model fit, it has yet not been used to its full potential. One reason for this was the impossibility of modeling individual tree species within each ALS acquisition year, which would add further explanatory power. Another reason was the lack of available satellite data for classification in some parts of the country, which led to missing values in the inventory data and restricted 19 FR units to a simpler regression model. Promising steps with respect to more up-to-date canopy height information have already been made, as the topographic survey institution of RLP will from this year on provide a country-wide canopy height model derived from aerial imagery acquisitions. These campaigns will in the future be conducted in a two-year period and allow to derive canopy height information matching the dates of terrestrial forest inventories. A study of Kirchhoefer et al. [49] recently indicated that similar model performance for German NFI data can be achieved using such imagery-based canopy height models. Additionally, the improved coverage and repetition rate of the Sentinel-2 satellite [50] will allow to produce annually updated tree species classification maps. We consider these alternative auxiliary data sources to also solve the problem of missing explanatory variables at inventory plots. One could also make use of the exhaustive information within the two-phase estimators by using the true auxiliary means [25,30], which could further decrease estimation errors. Previous studies of Mandallaz et al. [25] however showed that given a reasonable large sample size of the first phase, the differences in the estimation error are usually small. With respect to the substantial improvements in the temporal synchronicity between auxiliary and terrestrial inventory data, we consider the demonstrated double-sampling approach also to be very efficient for the estimation of change [51].

## 8. Conclusions

The study led to two major conclusions: (1) the EXTPSYNTH and PSMALL estimator generally achieved substantially smaller estimation errors on the two investigated forest district levels compared to the SRS estimator. Thus, the demonstrated small area estimation procedure constitutes a major contribution to an additional use of the German NFI data for estimation below the federal state level. Further close cooperation with the forest authorities is crucial to evaluate whether the achieved error levels are already sufficient enough in order to support forest planning decisions. A first study will concentrate on testing the EXTPSYNTH and PSMALL confidence intervals as a validation source for the stand-wise inventories. (2) Despite the quality restrictions, the ALS data and the tree species map were found to be well suited to model the mean timber volume on the plot and cluster level. With the prospect of more frequently updated aerial canopy height models and tree species maps, the two data sources will become even more attractive to be used as an integral part of future operational applications. The improving availability of remote sensing data will also allow to extent the demonstrated estimation procedure to the estimation of change. We consider this to be one of the next milestones towards a future operational use of the demonstrated small area estimation procedure.

## Author Contributions

A.H. conducted the study and wrote the manuscript. D.M. developed the design-based estimators and supported the statistical analysis. J.L. supported the study on the part of the State Forest Service Rhineland-Palatinate and cross-checked the analysis and the manuscript.

## Funding

This research received no external funding.

## Acknowledgments

We want to express our gratitude to Prof. H. Heinimann (Chair of Land Use Engineering, ETH Zurich) for supporting this study. We also want to explicitly thank Johannes Stoffels and Henning Buddenbaum from the Environmental Sensing and Geoinformatic Group of University of Trier for providing the ALS data and tree species classification map, and Kai Husmann and Christoph Fischer from the Northwest German Forest Research Institute Göttingen for their advice in processing the terrestrial inventory data. Special gratitude is also owed to Thomas Riedel from the Thünen Institute for providing the densified NFI sample grid, and Alexander Massey for proofreading.

## Conflicts of Interest

The authors declare no conflict of interest.

## Appendix A

#### Appendix A.1. R-Squared on Cluster Level

The R${}^{2}$ on the cluster level is calculated using the number of plots $M\left(x\right)$ of each cluster in order to weight for the varying number of plots on which ${Y}_{c}\left(x\right)$ and ${\widehat{Y}}_{c}\left(x\right)$ are based on.

$${R}^{2}=\frac{{\sum}_{x\in {s}_{2}}{\left(\frac{M\left(x\right)}{\overline{{M}_{2}}}\right)}^{2}{\left({\widehat{Y}}_{c}\left(x\right)-{\widehat{\overline{Y}}}_{c}\right)}^{2}}{{\sum}_{x\in {s}_{2}}{\left(\frac{M\left(x\right)}{\overline{{M}_{2}}}\right)}^{2}{\left({Y}_{c}\left(x\right)-{\widehat{\overline{Y}}}_{c}\right)}^{2}}$$

${Y}_{c}\left(x\right)$ and ${\widehat{Y}}_{c}\left(x\right)$ are the predicted and observed local densities on the cluster level calculated according to Equations (2) and (12). ${\widehat{\overline{Y}}}_{c}$ is the estimated sample mean corresponding to the weighted mean over all observed local densities on the cluster level (Equation (8a,b)).

#### Appendix A.2. RMSE on Cluster Level

The same weights $M\left(x\right)$ are also applied to calculate the RMSE on the cluster level. ${n}_{2}$ is the number of clusters used in the modeling frame.

$$RMSE=\sqrt{\frac{1}{{n}_{2}}{\sum}_{x\in {s}_{2}}{\left(\frac{M\left(x\right)}{{\overline{M}}_{2}}\right)}^{2}{\left({\widehat{Y}}_{c}\left(x\right)-{Y}_{c}\left(x\right)\right)}^{2}}$$

The relative or normalized RMSE is calculated by dividing the RMSE by the estimated sample mean ${\widehat{\overline{Y}}}_{c}$:

$$RMSE[\%]=\frac{RMSE}{{\widehat{\overline{Y}}}_{c}}$$

Note that the weights $\frac{M\left(x\right)}{\overline{{M}_{2}}}\equiv 1$ if the number of plots per cluster is constant.

## References

- Polley, H.; Schmitz, F.; Hennig, P.; Kroiher, F. National Forest Inventories—Pathways for Common Reporting; Springer: Heidelberg/Berlin, Germany, 2010; Chapter 13; pp. 223–243. [Google Scholar]
- Thünen-Institut. Dritte Bundeswaldinventur 2012, 2014. Available online: https://bwi.info (accessed on 3 February 2012).
- Kuliešis, A.; Tomter, S.M.; Vidal, C.; Lanz, A. Estimates of stem wood increments in forest resources: comparison of different approaches in forest inventory: Consequences for international reporting: case studyof European forests. Ann. For. Sci.
**2016**, 73, 857–869. [Google Scholar] [CrossRef] - Böckmann, T.; Saborowski, J.; Dahm, S.; Nagel, J.; Spellmann, H. Neukonzeption und Weiterentwicklung der Forsteinrichtung in Niedersachsen. Forst und Holz (Germany)
**1998**, 53, 298–302. [Google Scholar] - Von Lüpke, N.; Hansen, J.; Saborowski, J. A Three-Phase Sampling Procedure for Continuous Forest Inventory with Partial Re-measurement and Updating of Terrestrial Sample Plots. Eur. J. For. Res.
**2012**, 131, 1979–1990. [Google Scholar] [CrossRef] - Von Lüpke, N. Approaches for the Optimisation of Double Sampling for Stratification in Repeated Forest Inventories. Ph.D. Thesis, University of Göttingen, Göttingen, Germany, 2013. [Google Scholar]
- De Vries, P.G. Sampling Theory for Forest Inventory: A Teach-Yourself Course; Springer: Heidelberg/Berlin, Germany, 1986. [Google Scholar]
- Cochran, W.G. Sampling Techniques; Wiley Series in Probability and Mathematical Statistics; Wiley: Hoboken, NJ, USA, 1977. [Google Scholar]
- Särndal, C.E.; Swensson, B.; Wretman, J. Model Assisted Survey Sampling; Springer Science & Business Media: Heidelberg/Berlin, Germany, 2003. [Google Scholar]
- Gregoire, T.G.; Valentine, H.T. Sampling Strategies for Natural Resources and the Environment; CRC Press: Boca Raton, FL, USA, 2007. [Google Scholar]
- Köhl, M.; Magnussen, S.S.; Marchetti, M. Sampling Methods, Remote Sensing and GIS Multiresource Forest Inventory; Springer Science & Business Media: Heidelberg/Berlin, Germany, 2006. [Google Scholar]
- Mandallaz, D. Sampling Techniques for Forest Inventories; CRC Press: Boca Raton, FL, USA, 2008. [Google Scholar]
- Gillis, M.; Boudewyn, P.; Power, K.; Russo, G. Canada. In National Forest Inventories—Pathways for Common Reporting; Springer: Heidelberg/Berlin, Germany, 2010; Chapter 4; pp. 97–111. [Google Scholar]
- Chojnacky, D.C. Double Sampling for Stratification: A Forest Inventory Application in the Interior West; Technical Report; US Department of Agriculture, Forest Service, Rocky Mountain Research Station Ogden: Washington, DC, USA, 1998. [Google Scholar]
- Lanz, A.; Brändli, U.B.; Brassel, P.; Ginzler, C.; Kaufmann, E.; Thürig, E. Switzerland. In National Forest Inventories–Pathways for Common Reporting; Springer: Heidelberg/Berlin, Germany, 2010; Chapter 36; pp. 555–565. [Google Scholar]
- Gasparini, P.; Tosi, V.; DiCosmo, L. Italy. In National Forest Inventories—Pathways for Common Reporting; Springer: Heidelberg/Berlin, Germany, 2010; Chapter 19; pp. 311–331. [Google Scholar]
- Saborowski, J.; Marx, A.; Nagel, J.; Böckmann, T. Double sampling for stratification in periodic inventories—Infinite population approach. For. Ecol. Manag.
**2010**, 260, 1886–1895. [Google Scholar] [CrossRef] - Grafström, A.; Schnell, S.; Saarela, S.; Hubbell, S.; Condit, R. The continuous population approach to forest inventories and use of information in the design. Environmetrics
**2017**, 28. [Google Scholar] [CrossRef] - Massey, A.; Mandallaz, D.; Lanz, A. Integrating remote sensing and past inventory data under the new annual design of the Swiss National Forest Inventory using three-phase design-based regression estimation. Can. J. For. Res.
**2014**, 44, 1177–1186. [Google Scholar] [CrossRef] - Mandallaz, D. A three-phase sampling extension of the generalized regression estimator with partially exhaustive information. Can. J. For. Res.
**2013**, 44, 383–388. [Google Scholar] [CrossRef] - Rao, J.N. Small-Area Estimation; Wiley Online Library: Hoboken, NJ, USA, 2015. [Google Scholar]
- Breidenbach, J.; Astrup, R. Small area estimation of forest attributes in the Norwegian National Forest Inventory. Eur. J. For. Res.
**2012**, 131, 1255–1267. [Google Scholar] [CrossRef] - Goerndt, M.E.; Monleon, V.J.; Temesgen, H. A comparison of small-area estimation techniques to estimate selected stand attributes using LiDAR-derived auxiliary variables. Can. J. For. Res.
**2011**, 41, 1189–1201. [Google Scholar] [CrossRef] - Steinmann, K.; Mandallaz, D.; Ginzler, C.; Lanz, A. Small area estimations of proportion of forest and timber volume combining Lidar data and stereo aerial images with terrestrial data. Can. J. For. Res.
**2013**, 28, 373–385. [Google Scholar] [CrossRef] - Mandallaz, D.; Breschan, J.; Hill, A. New regression estimators in forest inventories with two-phase sampling and partially exhaustive information: A design-based monte carlo approach with applications to small-area estimation. Can. J. For. Res.
**2013**, 43, 1023–1031. [Google Scholar] [CrossRef] - Koch, B. Status and future of laser scanning, synthetic aperture radar and hyperspectral remote sensing data for forest biomass assessment. ISPRS J. Photogramm. Remote Sens.
**2010**, 65, 581–590. [Google Scholar] [CrossRef] - Naesset, E. Area-Based Inventory in Norway—From Innovation to an Operational Reality. In Forest Applications of Airborne Laser Scanning—Concepts and Case Studies; Springer: Heidelberg/Berlin, Germany, 2014; Chapter 11; pp. 216–240. [Google Scholar]
- Magnussen, S.; Mauro, F.; Breidenbach, J.; Lanz, A.; Kändler, G. Area-level analysis of forest inventory variables. Eur. J. For. Res.
**2017**, 136, 839–855. [Google Scholar] [CrossRef] - Magnussen, S.; Mandallaz, D.; Breidenbach, J.; Lanz, A.; Ginzler, C. National forest inventories in the service of small area estimation of stem volume. Can. J. For. Res.
**2014**, 44, 1079–1090. [Google Scholar] [CrossRef] - Mandallaz, D. Design-based properties of some small-area estimators in forest inventory with two-phase sampling. Can. J. For. Res.
**2013**, 43, 441–449. [Google Scholar] [CrossRef][Green Version] - Bundesministerium für Ernährung, L.u.V. Aufnahmeanweisung für die Dritte Bundeswaldinventur BWI3 (2011–2012); BMELV: Berlin, Germany, 2011. [Google Scholar]
- Bitterlich, W. The Relascope Idea. Relative Measurements in Forestry; Commonwealth Agricultural Bureaux: Wallingford, UK, 1984. [Google Scholar]
- Kublin, E.; Breidenbach, J.; Kändler, G. A flexible stem taper and volume prediction method based on mixed-effects B-spline regression. Eur. J. For. Res.
**2013**, 132, 983–997. [Google Scholar] [CrossRef] - Schmitz, F.; Polley, H.; Hennig, P.; Dunger, K.; Schwitzgebel, F. Die Zweite Bundeswaldinventur-BWI2: Inventur- und Auswertmethoden, Bundesministerium fur Ernahrung, Land-Wirtschaft und Verbraucherschutz (Hrsg); BMELV: Berlin, Germany, 2008. [Google Scholar]
- Mandallaz, D.; Hill, A.; Massey, A. Design-Based Properties of Some Small-Area Estimators in Forest Inventory with Two-Phase Sampling—Revised Version; Technical Report; Department of Environmental Systems Science, ETH Zurich: Zürich, Switzerland, 2016. [Google Scholar]
- Hill, A.; Massey, A. Forestinventory: Design-Based Global and Small-Area Estimations for Multiphase Forest Inventories. 2017. Available online: https://cran.r-project.org/web/packages/forestinventory/forestinventory.pdf (accessed on 16 October 2017).
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018. [Google Scholar]
- Gauer, J.; Aldinger, E. Waldökologische Naturräume Deutschlands-Wuchsgebiete. Mitt. Ver. Forstliche Standortskunde Forstpflanzenzüchtung
**2005**, 43, 281–288. [Google Scholar] - LWaldG. Landeswaldgesetz Rheinland-Pfalz (Forest Act Rhineland-Palatinate); Rhineland: Palatinate, Germany, 2000. [Google Scholar]
- Lamprecht, S.; Hill, A.; Stoffels, J.; Udelhoven, T. A Machine Learning Method for Co-Registration and Individual Tree Matching of Forest Inventory and Airborne Laser Scanning Data. Remote Sens.
**2017**, 9. [Google Scholar] [CrossRef] - Hill, A.; Buddenbaum, H.; Mandallaz, D. Combining canopy height and tree species map information for large-scale timber volume estimations under strong heterogeneity of auxiliary data and variable sample plot sizes. Eur. J. For. Res.
**2018**. [Google Scholar] [CrossRef] - Stoffels, J.; Hill, J.; Sachtleber, T.; Mader, S.; Buddenbaum, H.; Stern, O.; Langshausen, J.; Dietz, J.; Ontrup, G. Satellite-Based Derivation of High-Resolution Forest Information Layers for Operational Forest Management. Forests
**2015**, 6, 1982–2013. [Google Scholar] [CrossRef][Green Version] - Stoffels, J.; Mader, S.; Hill, J.; Werner, W.; Ontrup, G. Satellite-based stand-wise forest cover type mapping using a spatially adaptive classification approach. Eur. J. For. Res.
**2012**, 131, 1071–1089. [Google Scholar] [CrossRef] - White, J.C.; Coops, N.C.; Wulder, M.A.; Vastaranta, M.; Hilker, T.; Tompalski, P. Remote sensing technologies for enhancing forest inventories: A review. Can. J. Remote Sens.
**2016**, 42, 619–641. [Google Scholar] [CrossRef] - Breiman, L. Random forests. Mach. Learn.
**2001**, 45, 5–32. [Google Scholar] [CrossRef] - Congalton, R.G.; Green, K. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices; CRC Press: Boca Raton, FL, USA, 2008. [Google Scholar]
- Fahrmeir, L.; Kneib, T.; Lang, S.; Marx, B. Regression: Models, Methods and Applications; Springer Science & Business Media: Heidelberg/Berlin, Germany, 2013. [Google Scholar]
- Wilcoxon, F.; Katti, S.; Wilcox, R.A. Critical values and probability levels for the Wilcoxon rank sum test and the Wilcoxon signed rank test. Sel. Tables Math. Stat.
**1970**, 1, 171–259. [Google Scholar] - Kirchhoefer, M.; Schumacher, J.; Adler, P.; Kändler, G. Considerations towards a Novel Approach for Integrating Angle-Count Sampling Data in Remote Sensing Based Forest Inventories. Forests
**2017**, 8, 239. [Google Scholar] [CrossRef] - ESA. Sentinel-2 Earth Observation Mission, 2017. Available online: https://sentinel.esa.int/web/sentinel/missions/sentinel-2 (accessed on 29 March 2017).
- Massey, A.; Mandallaz, D. Design-based regression estimation of net change for forest inventories. Can. J. For. Res.
**2015**, 45, 1775–1784. [Google Scholar] [CrossRef]

**Figure 1.**(

**Left**): Study area with delineated FA forest management units. (

**Right**): Example for each of the three management units (from top to bottom): FA, FR and forest stand unit overlayed with the extended double-sampling cluster design. Green: Forest stand polygon layer defining the state and municipal forest area of this study.

**Figure 2.**(

**Left**): CHM (top) and tree species classification map (bottom) available on the federal state level. (

**Right**): Magnified illustration of the supports used to derive the explanatory variables from the auxiliary data. From top to bottom: CHM, aerial image, tree species classification.

**Figure 3.**Cumulative distribution of estimation errors under SRS, PSMALL, EXTPSYNTH and the PSYNTH estimator. (

**Left**): Results for the 45 FA units. (

**Right**): Results for the 321 FR units.

**Figure 4.**(

**Left**): Comparison of the g-weight variance between the PSMALL and the EXTPSYNTH estimator for the 321 FR units. (

**Right**): Difference in g-weight variance between the PSMALL and the EXTPSYNTH estimator in dependence of the terrestrial data (${n}_{2,G}$) in the FR unit.

**Figure 5.**Cumulative distribution of variance reduction by the PSMALL and EXTPSYNTH compared to the SRS estimator for the 45 FA and 321 FR units.

**Figure 6.**Share of the overall variance by the residual term of the PSMALL estimator for various small area sample sizes. Points are scaled by the overall percentage reduction/increase of the variance compared to SRS.

**Table 1.**Descriptive statistics of the forest observed on NFI sample plots located within communal and state forest area (${n}_{2}$ = 5791).

Variable | Mean | SD | Maximum |
---|---|---|---|

Timber Volume (m${}^{3}$/ha) | 300.9 | 195.6 | 1375.3 |

Mean DBH (mm) | 354.9 | 137.2 | 1123.2 |

Mean height (dm) | 239.6 | 72.4 | 497.4 |

Mean stem density per hectare | 101 | 114 | 1010 |

**Table 2.**Sample size for each phase in entire study area. ${n}_{\{1,2\},plot}$: number of plots. ${n}_{\{1,2\}}$: number of clusters. TSPEC: tree species map information.

Sampling Frame | ${\mathit{n}}_{1,\mathit{plot}}$ | ${\mathit{n}}_{1}$ | ${\mathit{n}}_{2,\mathit{plot}}$ | ${\mathit{n}}_{2}$ |
---|---|---|---|---|

municipal and state forest | 96,854 | 33,365 | 5791 | 2055 |

missing CHM | 18 | 10 | 0 | 0 |

missing TSPEC | 7060 | 3587 | 414 | 385 |

missing CHM and TSPEC | 3 | 2 | 0 | 0 |

missing CHM or TSPEC | 7075 | 3595 | 414 | 385 |

**Table 3.**Classification accuracies of the treespecies variable before and after calibration. ${n}_{ref}$: number of terrestrial reference plots. ${n}_{class}$: number of classified plots.

Main Plot Species | Producer’s Accuracy [%] | User’s Accuracy [%] | ${\mathit{n}}_{\mathit{ref}}$ | ${\mathit{n}}_{\mathit{class}}$ |
---|---|---|---|---|

Beech | 22.3 | 47.0 | 883 | 419 |

Douglas Fir | 24.8 | 48.7 | 230 | 117 |

Oak | 11.1 | 48.5 | 289 | 66 |

Spruce | 53.2 | 61.1 | 651 | 566 |

Scots Pine | 22.9 | 46.1 | 179 | 89 |

Mixed | 84.5 | 64.5 | 3152 | 4127 |

Overall accuracy: 62.0% | 5384 | 5384 |

**Table 4.**Model fit specifications for the two OLS regression models on the cluster level. Interaction terms are indicated by ’:’. () give the respective values on the plot level.

Model Terms | Model | ${\mathit{R}}_{\mathit{adj}}^{2}$ | RMSE | RMSE% |
---|---|---|---|---|

meanheight + stddev + meanheight${}^{2}$ + | full model | 0.58 | 90.11 | 29.76 |

treespecies + ALSyear + | (0.48) | (139.22) | (45.98) | |

meanheight:treespecies + | ||||

meanheight:ALSyear + meanheight:stddev + | ||||

stddev:ALSyear | ||||

meanheight + stddev + meanheight${}^{2}$ + | reduced model | 0.55 | 95.23 | 31.65 |

ALSyear + meanheight:ALSyear + | (0.45) | (144.13) | (47.60) | |

meanheight:stddev + stddev:ALSyear |

**Table 5.**${R}^{2}$, RMSE and RMSE% on the cluster level of the full regression model within ALS acquisition year strata (ALSyear). $Are{a}_{ALSyear}$: Area covered by ALS acquisition given in km${}^{2}$. n: sample size of validation data. () give the respective values on the plot level.

ALSyear | ${\mathit{Area}}_{\mathit{ALSyear}}$ | ${\mathit{R}}^{2}$ | RMSE | RMSE% | n |
---|---|---|---|---|---|

2012 | 2807 | 0.65 | 98.52 | 29.62 | 156 |

(0.61) | (135.84) | (44.87) | (408) | ||

2011 | 4361 | 0.60 | 96.89 | 29.66 | 354 |

(0.57) | (146.21) | (48.29) | (883) | ||

2010 | 4182 | 0.64 | 76.38 | 27.57 | 420 |

(0.51) | (120.90) | (39.93) | (1171) | ||

2009 | 2100 | 0.53 | 92.22 | 33.31 | 218 |

(0.42) | (133.42) | (44.07) | (559) | ||

2008 | 2968 | 0.61 | 87.10 | 32.20 | 247 |

(0.48) | (130.38) | (43.06) | (701) | ||

2008_1 | 2116 | 0.43 | 117.99 | 33.64 | 157 |

(0.33) | (175.43) | (57.94) | (394) | ||

2007 | 3498 | 0.56 | 82.43 | 26.57 | 135 |

(0.46) | (136.47) | (45.08) | (418) | ||

2003 | 602 | 0.34 | 85.92 | 27.31 | 145 |

(0.27) | (154.48) | (51.02) | (529) | ||

2002 | 775 | 0.52 | 87.25 | 27.22 | 97 |

(0.44) | (141.55) | (46.75) | (314) |

**Table 6.**Descriptive summary of point estimates and estimation errors for the volume of growing stock per area unit [m${}^{3}$/ha] on the two forest district levels. ${N}_{u}$: number of evaluated small area units.

District Level | Estimator | Point Estimates | Error${}_{[\%]}$ | |||||
---|---|---|---|---|---|---|---|---|

Mean | Min | Max | Mean | Min | Max | |||

FA | SRS | (${N}_{u}$ = 45) | 300.16 | 215.91 | 392.84 | 6.69 | 3.87 | 13.21 |

PSMALL | (${N}_{u}$ = 45) | 307.29 | 209.26 | 417.10 | 5.16 | 3.46 | 14.33 | |

EXTPSYNTH | (${N}_{u}$ = 45) | 307.27 | 209.01 | 415.02 | 4.78 | 3.25 | 13.88 | |

PSYNTH | (${N}_{u}$ = 45) | 306.90 | 223.51 | 409.92 | 2.34 | 1.54 | 3.95 | |

FR | SRS | (${N}_{u}$ = 321) | 302.77 | 99.89 | 552.87 | 16.94 | 2.76 | 55.51 |

PSMALL | (${N}_{u}$ = 321) | 308.15 | 159.64 | 568.67 | 12.24 | 3.48 | 44.94 | |

EXTPSYNTH | (${N}_{u}$ = 321) | 308.38 | 154.07 | 544.34 | 11.34 | 3.60 | 40.91 | |

PSYNTH | (${N}_{u}$ = 321) | 305.56 | 197.47 | 444.29 | 4.13 | 2.56 | 18.07 |

**Table 7.**Descriptive summary of variance reduction compared to SRS and relative efficiencies on the two forest district levels. ${N}_{u}$: number of evaluated small area units.

District Level | Estimator | Variance Reduction [%] | Relative Efficiency | |||||
---|---|---|---|---|---|---|---|---|

Mean | Min | Max | Mean | Min | Max | |||

FA | PSMALL | (${N}_{u}$ = 45) | 33.51 | 2.6 | 72.5 | 1.74 | 1.03 | 3.64 |

EXTPSYNTH | (${N}_{u}$ = 45) | 43.30 | 15.7 | 75.8 | 2.03 | 1.18 | 4.13 | |

FR | PSMALL | (${N}_{u}$ = 321) | 12.48 | $-1203.9$ | 96.8 | 2.54 | 0.08 | 31.61 |

EXTPSYNTH | (${N}_{u}$ = 321) | 24.75 | $-892.7$ | 97.0 | 2.95 | 0.10 | 33.70 |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).