Generalized Hierarchical Model-Based Estimation for Aboveground Biomass Assessment Using GEDI and Landsat Data

: Recent developments in remote sensing (RS) technology have made several sources of auxiliary data available to support forest inventories. Thus, a pertinent question is how different sources of RS data should be combined with ﬁeld data to make inventories cost-efﬁcient. Hierarchical model-based estimation has been proposed as a promising way of combining: (i) wall-to-wall optical data that are only weakly correlated with forest structure; (ii) a discontinuous sample of active RS data that are more strongly correlated with structure; and (iii) a sparse sample of ﬁeld data. Model predictions based on the strongly correlated RS data source are used for estimating a model linking the target quantity with weakly correlated wall-to-wall RS data. Basing the inference on the latter model, uncertainties due to both modeling steps must be accounted for to obtain reliable variance estimates of estimated population parameters, such as totals or means. Here, we generalize previously existing estimators for hierarchical model-based estimation to cases with non-homogeneous error variance and cases with correlated errors, for example due to clustered sample data. This is an important generalization to take into account data from practical surveys. We apply the new estimation framework to case studies that mimic the data that will be available from the Global Ecosystem Dynamics Investigation (GEDI) mission and compare the proposed estimation framework with alternative methods. Aboveground biomass was the variable of interest, Landsat data were available wall-to-wall, and sample RS data were obtained from an airborne LiDAR campaign that produced simulated GEDI waveforms. The results show that generalized hierarchical model-based estimation has potential to yield more precise estimates than approaches utilizing only one source of RS data, such as conventional model-based and hybrid inferential approaches.


Introduction
For several reasons, society's interest in forests and forestry is increasing. Forests play a key role in ambitions to mitigate climate change through moving from fossil-based economies to bio-based economies. They provide a long list of ecosystem services, including human recreational and economic uses. As a result, there is an increasing worldwide need for information about the state and change of forest resources and forest environmental conditions. One important example is the United Nations' Framework Convention on Climate Change (UNFCCC), which has become an important driver of forest inventory development since it highlights that forests are potentially huge sources or sinks of carbon dioxide and, thus, the agreements under UNFCCC require recurrent reporting of forest carbon pool changes [1]. Another example is the Forest Europe process, which relies on forest information for promoting sustainable management of European forests [2].
A major challenge is to develop methods to collect forest information cost-efficiently, without sacrificing information quality. Typically, National Forest Inventories (NFIs) based on field sample plots provide forest information at regional and national level. However, the ongoing development of earth observation technologies is rapidly improving, and the scope, quality and availability of the data provided offer substantial new possibilities for forest inventories e.g., [3]. Over the last few decades, several studies have been conducted where remote sensing (RS) and field data have been combined in order to enhance the precision of large-scale field based inventories, or to make forest surveys feasible in remote areas where field sampling is very costly.
When conducting a forest inventory, RS data can be incorporated at either the design stage or the estimation stage. At the design stage, RS data can be used for stratification e.g., [4], unequal probability sampling e.g., [5], or balanced sampling e.g., [6]. To utilize RS data in the estimation stage, either model-based inference [7][8][9], or model-assisted estimation including post-stratification [10], i.e., design-based inference, can be applied.
The vast availability of RS data also opens new possibilities for improving estimation efficiency by using combinations of several sources of RS data. While this can be achieved straightforwardly in the case of model-assisted estimation following established sampling theory e.g., [5,[11][12][13], this issue has been less explored for model-based inference for the case when auxiliary data are not available for the entire population. An important case is when wall-to-wall RS data (or a large sample) are complemented by a (sparse) sample of RS data that are strongly correlated with the forest attribute variable of interest. While several studies have utilized this type of combination of RS data for model-based inference e.g., [14][15][16], first steps towards a rigid statistical framework for this type of surveys were taken by Saarela et al. [17] and Holm et al. [18]. The current study is a generalization and expansion of those studies. The proposed technique has been termed hierarchical model-based estimation (ibid.) since in a typical case the data sources are nested. Commonly, the ambition is to utilize wall-to-wall (or a large sample of) RS data for the inference about population characteristics within a large study area. Since field data are expensive or may not be possible to obtain from all parts of the area, a sample of RS data is selected, and a model linking the study variable (from field plots) with sampled RS data is established. A key issue in hierarchical model-based estimation is that this model must provide precise predictions of the variable of interest. Thus, the sampled RS data should provide more accurate model predictions than the RS data available wall-to-wall. For example, in a biomass survey the wall-to-wall RS data might be obtained from the Landsat satellite and the sampled RS data from airborne laser scanning cf. [19]. Model predictions across the sampled set of RS data are used for estimating a second model, linking the variable of interest with wall-to-wall RS data. The latter model is used for the model-based inference, but in the uncertainty assessment both modeling steps must be accounted for [17,18,20].
Hierarchical model-based estimation can be used also in cases where data are not nested, as an approach to make use of a sparse network of existing field data. In this case a small (possibly purposive) sample of RS data is selected to link field data with a large sample of RS data, or RS data available wall-to-wall. This approach was pioneered by Boudreau et al. [14] and Nelson et al. [15], who used a combination of the Portable Airborne Laser System (PALS) and data from ICESat/GLAS for estimating aboveground biomass (AGB) in Québec, Canada. A similar approach was applied in a study by Neigh et al. [16] for assessment of forest carbon stock across the entire boreal forest region. However, these studies ignore parts of the models' contribution to the overall uncertainty of the biomass estimators. In Saarela et al. [17] it was shown that the studies might have underestimated the variance by up to 70%. However, the study by Saarela et al. [17] was conducted under simplifying assumptions, such as assuming all samples being conducted through simple random sampling and assuming that all models involved had homogeneous residual variance.
The main objective of this study is to generalize the existing estimators for hierarchical model-based estimation so that this estimation method can be applied also in cases when the model errors have non-homogeneous variance and cases where the errors are correlated, for example due to clustered sampled data. A further objective is to compare the proposed estimation method with existing methods, using data from study sites in the USA that mimic the type of data that will be available from the Global Ecosystem Dynamics Investigation (GEDI) mission [21]. The proposed generalization of existing theory in Saarela et al. [17] is important, since both field and intermediate RS data in most practical cases are clustered e.g., [22] and many models between RS data and biophysical features, such as biomass, have non-homogeneous variance e.g., [18,23].

Overview
In this section we first derive and present the theory for generalised hierarchical model-based (GHMB) estimation. In the next section, we then present the data from six study sites in the USA, which were used for assessing the performance of the new estimators. A key issue was to assess what precision can be expected if this estimation framework is used with a combination of field, GEDI and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) data. Because GEDI data are not available yet, we used GEDI-like waveforms simulated from airborne small-footprint LiDAR (see Section 3.1.1 for GEDI data description).
We compared the performance of GHMB estimation with: (i) two-stage model-based estimation described in Holm et al. [18], based on GEDI, Landsat 7 ETM+ and field data; (ii) hybrid estimation [24], which utilizes only the sampled GEDI data and field data; and (iii) conventional model-based estimation e.g., [25], which utilizes only wall-to-wall Landsat 7 ETM+ data and field data.

Generalized Hierarchical Model-Based Estimation (GHMB)
In model-based inference, the vector of N population values is seen as a realization from a vector random variable y = (y 1 , . . . , y N ) with a given joint probability distribution [26]. The expected value of If the size N of the target population, U, is large, we can assume that µ ≈ȳ (whereȳ is the population mean of the given realization) is a good approximation, and instead of predicting the target population meanȳ, we estimate its expectation µ e.g., [27].
The joint distribution G of the vector random variable y = (y 1 , . . . , y N ) is denoted a superpopulation model cf. [26]. The GHMB estimation approach relies on two superpopulation models of the class of multiple regression models. The first superpopulation model links target population element values y with predictor variables X (including a column of units) assumed to be strongly correlated with y. The second superpopulation model links y with predictor variables Z (including a column of units) assumed to be weakly correlated with y. In our example the variable of interest, y, is AGB, and X and Z are GEDI and Landsat 7 ETM+ data, respectively (see Section 3.1 for the data description). On this basis we denote the first superpopulation model G X and the second G Z . Since, G X and G Z describe the same joint distribution of y = (y 1 , . . . , y N ), they relate as µ|G X = µ|G Z , which is an essential basis for GHMB estimation.
In our study, G X and G Z are assumed to have a similar form and follow similar distributional properties, and thus, for our target population U we have: and here, β and α are vectors of model parameters to be estimated, and υ are random error vectors, and ω 2 Ω and θ 2 ∆ are the variance-covariance matrices with diagonal elements corresponding to individual error variances and off-diagonal elements to covariances between individual errors. Three dataset are required for GHMB estimations.
• The first dataset, denoted S, contains a sample of field data for which sampled RS data are also available. The dataset is used for estimating the model parameters β. Each estimator based on this set is given the subscript S; the data S comprise n field observations. • The second dataset, denoted Sa, contains the enlarged sample of RS data and the corresponding RS data from the wall-to-wall dataset. It is used for estimating the model parameters α, and any estimator based on this set has the subscript Sa; the data Sa comprise M sampled RS observations. • The third dataset contains the wall-to-wall RS data for the entire target population, U. The target population U comprises N population elements.
With a nested structure of data, Sa is a sample of U, and S is a sample of Sa. However, as we stated in the introduction, the datasets do not have to be nested, and the sample S may be selected independently from Sa. Figure 1 provides an overview of GHMB estimation. We estimated β by generalized least squares (GLS) estimators usinf dataset S e.g., [28]: Conditions E [ ] = 0 and E [υ] = 0 in G X and G Z , respectively, imply that for the superpopulation model G X , E [y|X] = Xβ and for the superpopulation model G Z , E [y|Z] = Zα. This does not imply that E [y|X] equals E [y|Z] for a given realization of X and Z. For example, assume tree biomass is modeled based on either tree diameter (X) or tree height (Z) these two models will be very different, and in formal terms this is because the models provide the expected values conditional on the different predictor variables. This difference between the two models may be more subtle when the models are based on different types of RS data, but the principle remains the same: the two models will be different. Thus, for a given realization of X and Z, y|X = y|Z, but E [y|X] = E [y|Z]. Using the models (1) and (2) we obtain Xβ + = Zα + υ, and thus, where u = υ − is a random variable. It can be seen thatxβ =zα (x andz are vectors of average values) and thus, the relationship agrees with µ|G X = µ|G Z . The relationship (4) is employed to estimate the model parameters α using the dataset Sa and y Sa = X Sa β S , i.e., The variance-covariance matrices ω 2 Ω and σ 2 Σ are among the essential parameters to be estimated. Under assumptions of homoskedasticity and independence, ω 2 Ω and σ 2 Σ become ω 2 I and However, under heteroskedasticity and dependent observations the form of ω 2 Ω and σ 2 Σ is different. Under heteroskedasticity, the matrices' diagonal elements, corresponding to individual error variances, are unequal. Dependent errors may arise due to several reasons, important examples are clustered sample data, spatially autocorrelated errors and nested samples: • with clustered data structure, Ω and Σ are block-diagonal matrices, where the blocks correspond to the clusters; • under spatial autocorrelation, the matrices' off-diagonal elements, corresponding to covariances between errors, are non-zero; • with nested samples, the dependency between datasets S and Sa results in a non-zero covariance between errors of model (1) and relationship (4).
In the present study we didn't account for the uncertainty due to the estimation of ω 2 Ω and σ 2 Σ, and assumed that they are known to a constant, i.e., ω 2 Ω S = ω 2 Ω S and σ 2 Σ Sa = σ 2 Σ Sa . This is a common practice when GLS estimators are applied cf. [28,29]. Therefore, since the ω 2 and σ 2 terms will cancel, estimators (3) and (5) can be rewritten as and The difference between (3) and (6), and (5) and (7) is that the covariance matrices are not estimated in (6) and(7) but assumed known cf. [28,29].
The expected value of the finite population mean is then estimated as where the subscript GHMB denotes "Generalized Hierarchical Model-Based" and ι U is an N-length vector, each element of which is 1 /N. The variance of µ GHMB is where Cov ( α Sa ) (see Appendix A for details) is the core expression for the GHMB estimator variance. As shown in the Appendix A the covariance composed of four terms. where In the general case all these terms would be non-zero and contribute to the variance of the GHMB estimator. However, in case the S and Sa datasets are independent, then E [bc ] = E [cb ] = 0. By replacing ω 2 and σ 2 with the corresponding estimators ω 2 and σ 2 we obtain the covariance estimator Cov ( α Sa ), i.e., where ω 2 in model (1) is estimated as e.g., [28] whereas σ 2 due to relationship (4), was estimated as where SSR(β, α|Σ Sa is needed because the response variable in relationship (4) is not measured but estimated using estimator (6).
It can be seen that the estimated covariance of β S is a part of Cov ( α Sa ) by substituting in estimator (11) ω 2 X S Ω S X S −1 to Cov β S e.g., [28]. The derivation of estimators (11) and (13) are The R package "HMB" by Saarela et al. [30] has function ghmb(), which provides estimates based on estimators (8) and (14). The package is based on a C++ library for linear algebra developed by [31].

Reference Methods for Comparison
We compared the performance of the GHMB estimator with the two-stage model-based estimation procedure presented in [18], which utilizes the same datasets. Additionally, we compared the GHMB estimator with estimators utilizing only a single source of RS data, i.e., an estimator for hybrid inference [24], and an estimator for conventional model-based inference e.g., [17]. Some details of these reference methods are provided below.

Generalized Two-Stage Model-Based Estimation (GTSMB)
In this estimation procedure, the regressors X in the model G X , are regressands and dependent on Z, i.e., for the kth variable out of (p + 1) variables in X. The same set of Z is used to predict each variable in X. Model parameters γ k are estimated using information from the Sa dataset employing the GLS cf. [28]. Similarly to GHMB estimation, we do not account for the uncertainty due to the Φ k estimation, and, thus, assume δ 2 k Φ k to be known to a constant In this approach the expected value of the finite population mean, µ, is estimated as where x U k = Z U γ k and the subscript GTSMB denotes "Generalized Two-Stage Model-Based". The variance of µ GTSMB is estimated as where . Details of the estimator (18) are presented in [18]. The R package "HMB" by Saarela et al. [30] was used to obtain estimates based on (17) and (18).
The correspondence between GHMB and GTSMB estimators were further evaluated as a part of this study. In Appendix B we show that under certain rather general conditions the estimators and variance estimators will provide approximately the same results for given datasets.

Hybrid Estimation
We employed the hybrid estimators for clustered data described in [24]. The estimator utilizes sampled X data from the Sa dataset, and accounts for sampling uncertainty due to the Sa sample and modelling uncertainty due to the model (1) (i.e., y = Xβ + ) using the dataset S [24].
Within hybrid inference, the population mean estimator is [24] (Equation (11), p. 101) where m is the number of clusters in Sa, x it β S is the ith cluster total, and A i is the number of population elements in the ith cluster.

Conventional Model-Based Inference (MB)
This estimation procedure uses only one source of auxiliary information available wall-to-wall, in our case the Landsat 7 ETM+ data. The population mean estimator is e.g., [17] (Equation (3), p. 899). where S y S is a vector of estimated model parameters using the dataset S following superpopulation model G Z in Equation (2). As in the previous approaches, in this case we also assume that the covariance θ 2 ∆ S is known to a constant θ 2 , i.e., θ 2 ∆ S = θ 2 ∆ S .
The model-based variance estimator is where Cov cf. [28].

Material
For this study we simulated six populations mimicking forest conditions in study areas across the USA ( Figure 2). For each site reference data were available from field measurements, Landsat 7 ETM+, and laser scanning using a method that mimics future measurements with the GEDI space LiDAR. In this section we first describe the datasets and, secondly, how these reference data were used for simulating AGB and RS data for each study site.

Simulated GEDI Data
GEDI's nominal footprint diameter will be approximately 25 m, with approximately 60-m along-track spacing [21]. After two years of operation aboard the International Space Station (ISS), the cross-track spacing of GEDI's sample is expected to be 500 m, producing a semi-regular lattice of sample lines from 51 • South to 51 • North (the area of the Earth traversed by the ISS orbit). Waveform properties used in the modeling process were quantified by relative heights (rh) from below which different amounts of energy were reflected. So, 'rh90' was the height below which 90% of the waveform energy returned, which was higher by an amount dictated by the vertical distribution of canopy material than 'rh30'.
The LiDAR data used in this study were derived by the GEDI Science Team from airborne discrete-return LiDAR (DRL) acquisitions, transformed to resemble the return waveforms to be collected by the GEDI mission. Further details related to LiDAR and field data acquisition are given in Appendix C. The transformation process is one where individual photon returns from across a given surface area are grouped into height quantiles, which are then integrated to derive a waveform describing the return height function for that area [32]. The GEDI Waveform Simulator realistically accounts for topographic and canopy penetration issues, while effecting the above physical transformation of return energy information from discrete to continuous functions.
The DRL acquisitions forming the basis of this study were collected with a RIEGL LMS-Q680i airborne laser scanner across the six study sites (Maine, Pennsylvania, South Carolina, Colorado, Minnesota) between June and August 2014, the Oregon site was scanned in June 2015. Data were collected in North-South lines that were 300-1000 m in width and spaced 5 km apart. Pulse density was at least 4 pulses per square meter. The DRL sample lines were transformed to simulate a surface of contiguous GEDI footprints along the sample lines. Ten strips of GEDI data, each about 510 m long, were available from each site [33].

Landsat 7 ETM+ Data
Landsat values were derived for each field plot from surface reflectance values (multiplied by 10 4 ) generated from the LEDAPS algorithm [34]. Pre-collection surface reflectance imagery from June to September 2015 for band 3 (red; B3) and band 4 (near-infrared; B4) were composited using a medoid method (multi-dimensional median; [35]), screening out clouds and cloud shadows using the F-mask algorithm [36]. This processing was conducted on the Google Earth Engine platform [37].

Field Data
Between 46 and 50 field plots were established in each of the six study sites from June-August, 2014 (Maine, Pennsylvania, South Carolina, Colorado, Minnesota) and from June-August 2015 (Oregon, see Appendix C for more details). The sample was designed to cover the range of vegetation conditions in each area, making use of 15 strata covering the bivariate distribution of LiDAR-based estimates of height (broken into 5 classes) and vegetation cover (3 classes). The sampled distribution excluded locations with less than 10% canopy cover (according to LiDAR metrics) and areas of non-forest according to the National Land Cover Dataset [38]. Approximately 25 random locations were chosen within each stratum, from which an approximately equal number (3)(4) were chosen for field measurement based upon accessibility.
Standard field protocols used by the US National Forest Inventory (managed by the US Forest Service Forest Inventory and Analysis program) were used in measuring trees on 16.15 m radius plots (yielding approximately the same area plot as a Landsat pixel) [39]. For details see Appendix C. AGB was calculated from the tree list for each plot as described in [39]. Plot numbers, forest types according to [40,41] and statistics are summarized in Table 1.

Simulated Populations
Each of the six areas depicted in Figure 2 were simulated independently; each contained approximately 50 field plots, with corresponding Landsat 7 ETM+ and GEDI data (simulated from DRL). In general, the target for the GEDI mission is to report AGB estimates by 1-km squares. Thus, the setup will be according to Figure 3a, i.e., the area is tessellated into 1-km grid-cells, each of which will have a certain number of GEDI footprints (the points) and potentially a wall-to-wall cover of Landsat data.  Our case study examples are of two kinds. In Case A, only data (GEDI and Landsat 7 ETM+) from within a given 1-km grid-cell are used. In Case B, data from eight neighboring grid-cells were used in addition to the center (target) grid, since model development data from a larger similar area may improve GHMB estimation. Note, that only the GHMB and GTSMB approaches can benefit from data from surrounding grid-cells, since such data can be used to improve the model used in the final step of the GHMB (and GTSMB) procedure, that is, developing a model linking predicted biomass values (based on GEDI data) with wall-to-wall Landsat data, and in the case of GTSMB procedure developing a multivariate model linking GEDI variables with Landsat data. The conventional MB and the hybrid methods cannot benefit from including a support area of this kind. The two cases are shown in Figure 3b,c.
Simulation were made to provide the data needed for Case B; Case A data were obtained as the corresponding data from the center 1-km grid-cell. Each 1-km grid-cell was tessellated into 2500 square plots of a size corresponding to the future GEDI LiDAR footprint size.

Correlated Multinomial Random Variables
To create our simulated populations mimicking forest conditions in the study sites we need correlation matrices between the AGB, GEDI and Landsat 7 ETM+ variables. The matrices were calculated using reference data and are given in Table 2. We applied a classical method for generating correlated multivariate normal random variables for a given covariance matrix, i.e., where, a = (a 1 , . . . , a t ) is the desired multivariate normal vector of t variables (in our case study the variables are AGB, rh60, rh90, B3 and B4), f = ( f 1 , . . . , f t ) is a row vector of independent standard normals, µ = (µ 1 , . . . , µ t ) is the vector of corresponding means, and L is a triangular matrix such that LL = Cov (a k , a l ). We applied Cholesky decomposition to derive the matrix L. The covariance matrix Cov (a k , a l ) = Corr (a k , a l ) sd (a k ) sd (a l ) and the expected values µ were estimated using reference data for each study site. Table 3 provides sd (a k ) and expected values µ for each study site. The independent vector of standard normal random variables f was generated randomly for a given simulated spatial autocorrelation of exponential form as a function of spatial distances between square plots, i.e., ρ i,j = e −a(distance i,j ) . The spatial autocorrelation values are presented in Table 4. We emphasis that for purposes of this study we assumed AGB autocorrelation as an average of GEDI rh60 and rh90 autocorrelation values (estimated using GEDI simulated strip data), given strong correlation between AGB and GEDI variables. This is an important point that the AGB autocorrelation assessment was not a part of the computation process.

Regression Modeling
For each study site five regression models were fitted. Table 5 gives the model description and Table 6 provides information on the degrees of freedom, which were employed in the regression models involved for the different estimation methods. Dataset S was randomly selected from the square plots belonging to the target population U (1-km grid-cell) and its support area (eight neighboring 1-km grid-cells) corresponding to Case B, i.e., from the 3 × 3 km area (see Figure 3c).

Evaluation Criteria
We used the relative standard error (rSE) to assess the precision of the estimators where µ is the expected value of population mean and in our study cases is the mean value of AGB over field plots (see Table 3). We also analyzed the uncertainty contribution of different sources in GHMB and hybrid estimation, i.e., for GHMB uncertainty contribution (i) due to the AGB-GEDI model, and (ii) due to the AGB-Landsat (Sa) model; for hybrid uncertainty contribution (i) due to the modeling, and (ii) due to the sampling. We estimated proportions of each uncertainty contribution, i.e., for GHMB estimation the proportions are estimated as (estimator (11) with substituting ω 2 X S Ω S X S −1 to Cov β S , and estimator (14)): for hybrid estimation the proportions are estimated as (estimator (20)): Ideally the performance of estimators and variance estimators should have been evaluated through Monte Carlo simulations with repeated simulation of both the populations and selection of datasets S and Sa, and the estimations. However, as seen from the formulas (11) and (14) the variance estimator is conditional on the sample outcomes X S and Z Sa . Thus different simulated samples will result in different variances. It is the random deviations and u that causes uncertainty according to the formulas. Therefor simulations that result in different X S and Z Sa cannot strictly be used to compare the empirical variance with the mean estimated. The simulated variances should be seen as examples of the size. For this reason we have chosen a restricted number of simulations. Simulations of only and u could be used for empirical studies, but then (for a given set of simulations) only for a single fixed sample X S and Z Sa .

Results
In Table 7, the expected population mean AGB values, µ, the corresponding simulated mean values, y U , and the estimates for the GHMB and GTSMB (Cases A and B), hybrid, and MB methods are presented. It can be seen that the AGB varied quite substantially between the different sites with the highest value in Oregon and the lowest in Minnesota. The average of estimated values over 250 repetitions were always fairly close to the average of simulated true means, y U . Our assessment of the performance of the different methods is based on estimated variances, recalculated and expressed as relative standard errors, for each method in each of the study areas as well as on average across the different study areas.
In Figure 4, the average relative standard error for the different methods across the study sites is presented. In this case a relative variance was first estimated for each site; then an average for the sites was calculated from which the square root was computed. On average, the GHMB and GTSMB methods performed about equally well (Appendix B), and they were superior to the other methods in case data from neighboring grid cells were used for improving the models (case B). When data from the target grid cell, only, were used for the model building (case A), the GHMB and GTSMB methods were outperformed by the hybrid and conventional MB estimation methods.
Relative standard error for each of the methods in each site is detailed in Table 8. It can be seen that the performance of the methods, in terms of relative standard error, varied substantially between the sites. The hybrid estimation method showed fairly consistent results, in terms of precision, across the different sites with the smallest relative standard error in Colorado and the largest in Pennsylvanian and New Jersey border. In this simulation study we were not able to indicate any specific relation between estimators' performance and forest types. To conduct such analysis, real life data would be required rather than simulated. The MB method performed almost as well as hybrid estimation with similar patterns across study sites, and GHMB and GTSMB Case A estimations. The GHMB and GTSMB methods typically decreased their relative standard errors by about 40% when data from surrounding grid-cells were applied in the model building.   Table 9 shows the contribution of different sources of uncertainty to the variance GHMB and hybrid estimation methods. It can be seen that the contribution of the AGB-GEDI estimated model uncertainty is substantial for both estimation methods; for GHMB Case A it varies between 19% (MN) and 47% (CO), in Case B between 54% (CO) and 81% (MN), and in hybrid estimation between 24% (ME) and 60% (PANJ). Comparing Cases A and B in GHMB estimation, we can see that with an increased number of GEDI footprints (from 94 to 727), the uncertainty contribution of the AGB-Landsat (Sa) estimated model decreased.

Discussion
In this article the hierarchical model-based estimation method presented by Saarela et al. [17] has been extended to cases when the model errors have non-homogeneous variance and cases where the errors are correlated, for example due to clustered sample data and/or spatial autocorrelation. A main component of this development is that generalized least squares theory is applied for the parameter estimation in the regression analysis, in which case the parameter estimation and the corresponding estimation of the covariance matrix for the parameter estimates accommodate such data structures. Also, the similarities between the GHMB and the GTSMB methods [18] are further explored and a proof is given (Appendix B) that the two methods will provide identical estimates and variances, and almost identical variance estimates, under certain conditions. However, with the expansion of the GHMB theory presented in this article, the GHMB method has a potential to be applied under a wider range of conditions than the GTSMB method, which among other things assumes independence between the two datasets used for the model building. The GHMB method might also be considered more intuitive to apply since it directly predicts the reference data (AGB values) that are used for the model building in the final step. The GTSMB method, on the other hand, uses the wall-to-wall RS data for predicting the metrics that would have been obtained from the sampled RS data source.
The results from the case studies indicate that the GHMB and GTSMB methods lead to more precise results, when a large support area is used for developing the model linking the wall-to-wall dataset (simulated Landsat data in our case) with the model predictions from the sample RS data (simulated GEDI data in our case). Restricting the data for the model building to a smaller area decreases the precision of the two methods. The hybrid estimation method and the conventional MB method were superior to the GHMB and GTSMB methods when only a 1-km support area was used. With the hybrid method, the AGB predictions from the GEDI sample, only, are used for the inference in our case. The conventional MB method makes predictions for all units in the wall-to-wall dataset (simulated Landsat data). Note, that Case B is only relevant for GHMB and GTSMB, since the dataset S (the field sample plots) was assumed to have a fixed size regardless of the size of the target area, and thus, the models used for the MB estimation would be the same in Case A and Case B. Thus "borrowing strength" for the model development from surrounding grid-cells does not change anything in the cases of conventional MB estimation. This holds true also for hybrid estimation, based on the same argument.
Overall, the performance of the different methods depends on many factors. A first requirement is that the study area is large enough, so that the assumption that the superpopulation mean value is approximately the same as the study area mean holds. Further, the goodness-of-fit of the models involved is another core issue. With a very good model linking wall-to-wall data with AGB values, there is no need for complicating the estimation with hybrid, GHMB or GTSMB methods. However, most wall-to-wall RS datasets are weakly correlated with AGB which is the reason why samples of RS data that are strongly correlated with AGB are of interest as additional sources of auxiliary data in more advanced estimation procedures. Lastly, the sample sizes of the S and Sa datasets are important for the precision of the estimators. In general, increasing the sample sizes will increase their precision. Table 9 supports this statement, as it shows that 47 degree of freedom for fitting the AGB-GEDI model resulted in low precision of estimated β and, hence, the uncertainty contribution to the estimated variance in GHMB estimation for Case A was between 38% and 19%, and for Case B between 62% and 87%. The low precision of estimated β lead to the situation that the conventional MB method with only one source of weakly correlated wall-to-wall RS data (simulated Landsat data in our example) slightly outperformed GHMB (and GTSMB) in Case A, where only 94 GEDI footprints were available (see Figure 4).
The niche for GHMB and GTSMB estimation appears to be the important cases where the following requirements are fulfilled: (i) field data are sparse and expensive to acquire, (ii) wall-to-wall RS data (or a large sample of RS data) are available, which can be fairly well fit to the target variable, and (iii) RS data which can be very well fit to the target variable are available, but only samples of such data can be acquired. Since AGB models based on laser data can be expected to be more generally applicable across large regions than AGB models from Landsat data, it should make sense to train local models of the latter kind using pseudo-field data from predictions with the former kind of models, and thus apply GHMB or GTSMB estimation.
Several methodological issues may be brought up in relation to this study: • One important issue is that all the case study data were simulated, based on sparse samples of reference data. The simulations are simplified generalizations of the real world, that leave out many important issues that must be handled in practical surveys, such as delineating forests from non-forest land [42]. In practical applications, land-use maps need to be applied to delineate forests before the GHMB method is employed. • Another important restriction of the present study is that the results are based on estimates from a small number of iterations. A future expanded study should be based on Monte Carlo simulations of both the populations and the sampling from these populations, as a basis for empirical evaluation of the proposed estimators. Another method for simulating the multivariate variable a in Equation (23) might be applied to demonstrate explicitly abilities of the proposed estimators. • Further, future studies on this subject should also deal with the details of how to estimate the correlation matrices of model errors that are required for the GLS regression; in this article these details are only briefly addressed. One of the solutions to this problem could be iterative re-weighted least squares regression methods, such methods are often applied in geostatistical approaches. • The current GHMB estimator is derived under the assumption that the target population is large, i.e., so that the population mean is at least approximately equal to the superpopulation mean.
Modifications of the GHMB estimators for small-area estimation should be addressed in a potential future study. • In the current study we assumed that the regression models involved in the AGB assessment by means of GEDI and Landsat data are linear. However, in reality the relationship between AGB (or growing stock volume) and height-like measures tends to be nonlinear e.g., [23]. A further elaboration of the GHMB method for nonlinear models would be needed to handle such cases. • Lastly, the performance of GHMB method in comparison to other methods should be analyzed as a basis for making recommendations on what method is appropriate under different conditions.
The above issues identify areas of needed research in the context of our findings. Including those issues into the current study would have increased the scope and length of the article considerably, and, thus, we chose to put those issues on hold for future studies.

Conclusions
Design-based inventory methods based on field sampling are limited by the availability of plot data. GHMB (and GTSMB) allows estimation of biomass in areas where there may be none or very limited field data, such as the 1-km grid-cells of interest to the GEDI mission, by taking advantage of multiple levels of RS data. Specifically, field data and high-quality RS data from similar areas outside the domain of interest can be used to calibrate wall-to-wall predictions within the domain of interest using synoptically collected RS data more weakly related to biomass. This paper augments previous hierarchical estimation methods by presenting methods that can be applied in cases where model errors have non-homogenous variance and where the model errors are correlated. Both cases are relevant for use of data from the upcoming GEDI LiDAR mission.

Acknowledgments:
The authors gratefully acknowledge Steven Hancock of the University of Maryland, who simulated the GEDI waveforms, Warren Cohen (US Forest Service), who helped to direct much of the field and LiDAR data collection, Zhiqiang Yang (US Forest Service), who helped in the development of "HMB" R package, and GEDI Science Team. The authors are thankful to the four anonymous Reviewers, whose comments helped to improve the clarity of the article.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Generalized Hierarchical Model-Based Estimators
For a realization, i.e., our target population, we have superpopulation models: We also defined the following relationship between the two models for the given realization: For given datasets S, Sa, and U, and matrices Σ Sa and Ω S : Knowing that y Sa = X Sa β S and β S = X S Ω −1 S X S −1 Due to y S = X S β + S : Due to X Sa β = Z Sa α + u Sa : where Under the S⊥Sa assumption, E [bc ] = E [cb ] = 0: e.g., [28] we can rewrite Here we show that the trivial estimator for σ 2 through the sum of squared residuals divided by the degree of freedom, leads to a biased estimation. We also derive an unbiased σ 2 estimator.
We denote SSR(β,α|Σ Sa ) df Sa = σ 2 . The expected value of σ 2 is: Due to X Sa β = Z Sa α + u Sa (for the given realization of X Sa and Z Sa ): Therefore, And thus, the expected value of the sum of squared residuals is (note: here we used that the quadratic form t Jt = Tr [tt J] = Tr [Jtt ] for any column vector t and matrix J), Knowing that E u Sa u Sa = σ 2 Σ Sa , E S S = ω 2 Ω S and E S u S = E u Sa S = Cov ( S , u Sa ), Thus, It can be seen that σ 2 estimator is biased, and thus, to derive an unbiased estimator σ 2 we have to correct σ 2 for the estimated BIAS, i.e., And thus, Under the independence assumption between datasets S and Sa, Cov ( S , u Sa ) = 0, we have Knowing that Cov β S = ω 2 X S Ω −1 S X S −1 we can rewrite estimator σ 2 as

Appendix B. A Comparison between Expectations and Variance Estimators of the Two Methods: GHMB and GTSMB
Below it is shown that the two methods GHMB and GTSMB lead to identical results under certain conditions. Throughout below we neglect the effects of estimating covariance matrices and use notations as if the matrices were known to a constant.
We start with the estimators: for GHMB we have, where, according to estimator (7) For GTSMB we have the same β S = (X S Ω −1 S X S ) −1 X S Ω −1 S y S . The link between GEDI and Landsat data is given by The parameters γ k are estimated from the Sa sample and we have the GLS estimators where Φ −1 k should be understood as Φ −1 k,Sa . From (A17) we obtain the vector of estimated GEDI values, given the Landsat Z ones as the columns of X = Z Γ Sa , where Γ Sa is the matrix with the vectors γ k as columns.
Thus, for GTSMB estimation we have the estimator Comparing (A14) and (A18), we can see that µ GHMB = µ GTSMB , if Φ k,Sa = λ k Σ Sa for some constants λ k , for k = 2, . . . , (p + 1) (A19) and thus This is so because the λ k cancel in (A17), and (A20) multiplied by β S is equal to α Sa . The relation (A19) seems very restrictive. It means that the correlation matrices for the k error terms d k vectors are identical. However, this is not unrealistic as the GEDI variables x k are strongly correlated. Also, this common correlation matrix is assumed to be equal to the correlation matrix of the error terms of the GHMB model. This is not unrealistic either, since the GEDI variables should show a high correlation with field data y and so should the GHMB predictions y. At least, the assumption (A19) is an acceptable approximation. In the homoskedastic GEDI-Landsat case (when Σ and the Φ k are normed unit matrices) the relation (A19) holds automatically, whether the Field-GEDI relation is homoskedastic or not.
Thus, µ GHMB = µ GTSMB under certain conditions, exactly or approximately. Hence, the variances are so too. Still, it remains to see whether or not the two variance estimators are (almost) identical under some conditions. We restrict this study to the case where the S and Sa samples are selected independently. The GHMB variance estimator is then (estimators (11) and (14)) where The GTSMB variance estimator is (estimator (18)) We introduce the matrix P that is the same for all k and that, in accordance with the condition (A19) is assumed to fulfill for some constants ϕ k , k, l = 1, . . . , (p + 1) (P = P Sa throughout). Then γ k = (Z Sa P −1 Z Sa ) −1 Z Sa P −1 x Sa,k since ϕ k cancels (see (A17)), and Thus, the first term in the (A23) equals Due to the assumptions (A19) and (A24) we also have Σ Sa = h 2 P for some constant h 2 . Since h 2 cancels in the second term of estimator (A22), we see that the contribution of this term to V ( µ) GHMB equals the first term in V ( µ) GTSMB . Next, we will show that he first term of V ( µ) GHMB equals to the second one in V ( µ) GTSMB . For this we need to add that condition (A24) also holds for the cross-covariances, i.e., that Φ kl = ϕ kl P (A27) We have, since ϕ k cancels, and from this and (A27) we obtain Cov ι U x U,k , ι U x U,l = δ kl ϕ kl ι U Z U (Z Sa P −1 Z Sa ) −1 Z U ι U . (A28) Hence, by summation, we obtain for the second term of V ( µ) GTSMB (A29) Next we will show that ∑ (p+1) k=1 ∑ (p+1) l=1 β k β l δ kl ϕ kl = σ 2 h 2 (where Σ Sa = h 2 P), exactly or approximately, and then In the hierarchical approach for the given realization of X Sa and Z Sa we have We insert the second stage model (A16) of the two-stage approach, to get X Sa β = Z Sa Γβ + D Sa β.
We identify the fixed and random parts of expression (A33) and see that Written in a clear way, we have (omitting the index Sa) By taking the expectation of the product of the two sides of (A35) with their transposes we obtain Recalling that Σ Sa = h 2 P and Φ kl = ϕ kl P, we have β k β l δ kl ϕ kl P =σ 2 h 2 P. and, thus, by replacing expected values with their estimators, what is exactly the first term of the V ( µ) GHMB . Numerical examples (with simulated data) have shown that the two first terms of the variances give identical sums. Further, the size of the third term of V ( µ) GTSMB has been shown to be much smaller that the other two (it contributed with about 0.5% of the variance). The third term could be seen as a second order correction of the first two.

Appendix C. Field and LiDAR Data Collection Methods
LiDAR data were collected using a Riegl LMS-Q680i system at five sites (SC, PANJ, ME, MN, CO) in the summer of 2014 and a sixth site (OR) in the summer of 2015 (see Figure 2 for locations and Table 1 for site descriptions). All data were collected during snow-free time periods when vegetation was fully leaf-on and non-senescent. The data was collected in north-south oriented flight lines spaced 5 km apart. Some crossing flights were flown for calibration purposes. Technical specifications for the LiDAR acquisition included: point density of 4 pulses per square meter; altitude of 732 m; field of view of 60 degrees; pulse rate of 330 kHz; nominal swath width of 812 m; horizontal and vertical accuracy (root mean square error in z) of 50 cm [33]. The LiDAR data were processed in Fusion [43] to produce rasters of various metrics with a 30 m cell size to support the plot selection process described below.
Approximately fifty field inventory plots were established at each area in the summer of 2015 [44]. To ensure that field plots represented the full range of biomass levels within forested areas of each scene, the LiDAR data was used to inform the selection of field plot locations via a stratified sampling procedure. After masking out non-forest areas within the LiDAR coverage using a LiDAR-based percent cover threshold (10%) and the National Land Cover Dataset [38], fifteen strata were delineated based upon three vegetation cover and five height classes within the random cells. A minimum of 23-24 candidate plot locations were generated for each of the 15 identified strata. Final field plots were selected from these candidate plots on the basis of accessibility and valid forest land cover status. Once in the field, crews aimed to visit 50 sample plots at each study area, with 3-4 plots in each stratum.
At each plot location, live and dead trees with dbh > 12.7 cm were measured on a 16.2-m radius plot, and trees with 2.54 cm < dbh < 12.7 cm were measured on a 4.57-m radius circular plot.
Field protocols consistent with those used by the US Forest Service were used in measuring trees [39]. In addition, survey-grade GPS coordinates (<1 m error) were acquired for each plot center.
Large-footprint LiDAR waveforms similar to those expected from GEDI were simulated from the acquired lidar data using the method presented in [32], in which waveforms are modeled as the sum of individual returns from surfaces at different heights, accounting for instrument-specific properties. Realistic noise was added to the simulated waveforms following [45,46]. The expected signal to noise ratio (SNR) of GEDI signals has been predicted through link margin analysis. For a given SNR, the probability of the ground elevation being correctly identified can be calculated and used to quantify the expected measurement accuracy. The simulator has been validated against real LVIS data (airborne large-footprint, full-waveform liDAR similar to GEDI; [47]) in terms of waveform metrics and metric accuracy in the presence of noise.