Estimation of Changes of Forest Structural Attributes at Three Different Spatial Aggregation Levels in Northern California using Multitemporal LiDAR

Accurate estimates of growth and structural changes are key for forest management tasks such as determination of optimal rotation times, optimal rotation times, site indices and for identifying areas experiencing difficulties to regenerate. Estimation of structural changes, especially for biomass, is also key to quantify greenhouse gas (GHG) emissions\sequestration. We compared two different modeling strategies to estimate changes in V, BA and B, at three different spatial aggregation levels using auxiliary information from two light detection and ranging (LiDAR) flights. The study area is Blacks Mountains Experimental Forest, a ponderosa pine dominated forest in Northern California for which two LiDAR acquisitions separated by six years were available. Analyzed strategies consisted of (1) directly modeling the observed changes as a function of the LiDAR auxiliary information (δ-modeling method) and (2) modeling V, BA and B at two different points in time, including a term to account for the temporal correlation, and then computing the changes as the difference between the predicted values of V, BA and B for time two and time one. We analyzed predictions and measures of uncertainty at three different level of aggregation (i.e., pixels, stands or compartments and the entire study area). Results showed that changes were very weakly correlated with the LiDAR auxiliary information. Both modeling alternatives provided similar results with a better performance of the δ-modeling for the entire study area; however, this method also showed some inconsistencies and seemed to be very prone to extrapolation problems. The y-modeling method, which seems to be less prone to extrapolation problems, allows obtaining more outputs that are flexible and can outperform the δ-modeling method at the stand level. The weak correlation between changes in structural attributes and LiDAR auxiliary information indicates that pixel-level maps have very large uncertainties and estimation of change clearly requires some degree of spatial aggregation; additionally, in similar environments, it might be necessary to increase the time lapse between LiDAR acquisitions to obtain reliable estimates of change.


Introduction
Light detection and ranging LiDAR data have been extensively used in forest inventories to provide auxiliary information that is highly correlated with multiple forest structural attributes [1][2][3]. This strong correlation allows estimating forest structural attributes more efficiently than if only field measurements are available [4]. In addition, the spatially explicit nature of LiDAR enables the mapping of forest attributes at fine resolutions (e.g., [2,5]). Accurate estimates of growth and structural changes are key for forest management as multiple management tasks such as determination of optimal rotation times, calculation of site indexes or the identification of areas experiencing difficulties in regeneration. Estimation of biomass is also key to quantifying greenhouse gas (GHG) emissions/sequestration, to comply with the International Panel on Climate Change (IPCC) reporting and good practice guidelines [6], and to develop a correct appraisal of forest resources for carbon markets. The extensively used area based approach (ABA) [1] provides a way to estimate forest attributes at multiple levels ranging from single pixels to large areas using LiDAR auxiliary information [7]. Availability of repeated LiDAR data acquisitions has opened the door to estimation of changes in forest structural attributes over time (e.g., [8,9]) using the ABA method.
In the ABA, the area under study is covered by a regular grid that will define a population of pixels or grid cells. In this approach, the field plots used to train models and the grid cells are of the same size, typically between 400 m 2 and 900 m 2 . A direct application of predictive models will render predictions for grid units of size too small to be considered of interest for reporting in forest inventories. Areas of interest (AOIs) (i.e., the areas for which estimates are needed) are typically geographic units that can vary in size depending on the particular application. For worldwide inventories or inventories over continents or countries, AOIs are typically administrative or political units such as countries or municipalities. In forest management applications, AOIs are typically stands, compartments or even complete forests or landscapes. All these AOIs require spatial aggregation of grid units. However, validations of predictive models in the ABA literature are typically performed using global metrics of model fit, such as the sample-based root mean square error or bias, that provide average measures of uncertainty for predictions made for pixels or plots. These measures of uncertainty derived from the model fitting stage do not directly translate into measures of uncertainty for predictions for AOIs composed of multiple pixels (i.e., countries, municipalities, forests, stands, etc.). In addition, even when considering single pixels, they are not AOI-specific, as they only provide an average value, across the entire population, of the error that can be expected using a given model.
Thus, it is clear that uncertainty measures used as quality controls in forest inventories need to be made at the AOI-level and change estimation is not an exception. For large areas holding large sample sizes, AOI-specific estimates of means or totals and their measures of uncertainty can be obtained using direct estimators (e.g., [10][11][12]) that use only use sample data from the AOI under consideration. However, if the AOI sample sizes are not large enough to support direct estimates with reliable precision, then they must be regarded as small areas [13].
Small area estimation (SAE) techniques, especially empirical best linear unbiased predictors (EBLUPs) in combination with the ABA approach have been used to obtain estimates, and their corresponding measures of uncertainty for subpopulations such as municipalities [14], groups or management units [15] and stands [4,14,16,17]. SAE techniques allow correcting the potential bias problems of synthetic predictions (i.e., predictions developed assuming that a general model developed at the population level holds for all subpopulations) and also permit reducing the large variance problems of direct estimators when AOIs sample sizes are small [13]. In addition, while EBLUPs have been extensively used in SAE contexts, they can also be used to produce estimates for subpopulations or AOIs with large sample sizes and preserve important advantages over other methods. First, they allow obtaining model-unbiased estimates and their corresponding measures of uncertainty for all AOIs using a single model that explicitly considers potential variations between AOIs. This is a clear advantage over synthetic methods that assume that a certain relation derived for the entire population holds in all AOIs. A second advantage of EBLUPs is that it is possible to reduce the modeling effort required by direct model-based or model-assisted methods where a model is needed for each AOI. It is thus clear that SAE techniques in combination with LiDAR auxiliary information have potential applications in multiple forest inventories contexts. Unfortunately, to the best of our knowledge, all studies on SAE and forest inventories have focused on estimation of structural attributes at a given point in time, and little is known about: (1) their performance when applied to forest structure change estimation, and (2) about how these techniques compare to other methods used for estimation of changes in AOIs comprising entire populations [10,12,18,19] and especially subpopulations [20].
In this study, we analyzed the two most commonly used strategies to model changes in structural forest attributes using repeated LiDAR acquisitions, and analyzed their performance when used to obtain EBLUPs for AOIs of different size. The first strategy, referred hereafter as the δ-modeling method, considers the change, δ, over the time between LiDAR acquisitions as the model response. The second strategy, which we will call y-modeling method, focuses on modeling the structural attributes y, and their derived change over time. As a novelty, in the y-modeling method, the temporal correlation of both model errors and AOI random effects were taken into account. We considered changes in three structural variables, and AOIs at three different spatial aggregation levels in order to provide insights for future applications where estimates for an entire population and for subpopulations of different sizes are needed. Variables under study are standing volume (V), above ground biomass (B) and basal area (BA) and AOIs subject to analysis are (1) an entire forested area or landscape, (2) subpopulations that in this case are forest stands and (3) pixels as gridded maps are common output in mapping applications.

Study Area
The study area is Blacks Mountains Experimental Forest (BMEF), a 3715 ha forest managed by the United States Forest Service, located northeast of Lassen National Park in northern California, USA ( Figure 1). Elevation ranges from 1700 m to 2100 m above sea level. Slopes are gentle (<10%) on the lower parts of the forest and moderate (10%-40%) at higher elevations. Climate is Mediterranean with a certain degree of continentality, with dry summers and wet and cold winters when precipitation is in the form of snow. Average precipitation is 460 mm per year with monthly average temperatures that range from −9 • C to 29 • C. Soils are developed over basalts with depths that range from 1 to 3 m. Ponderosa pine (Pinus ponderosa Lawson & C. Lawson) dominated forest occupies the majority of the area. Incense cedar (Calocedrus decurrens (Torr.) Florin), white fir (Abies concolor (Gordon & Glend) Hildebr) and Jeffrey Pine (Pinus jeffreyi Grev & Balf.) are abundant accompanying species. Forest structure is relatively open and the canopy cover varies greatly within the forest (see Figure 1). A more detailed description of the study area can be found in [21,22].

Sampling Design and Field Data
In total, 106 forested stands were delineated in BMEF. Small non-forested patches were masked in the study area and hence were not considered part of the population under study. Out of the 106 forested stands, 24 were selected and sampled in the field. Nine of the remaining 82 unsampled stands were subject to thinning during the period between two available LiDAR acquisitions (i.e., [2009][2010][2011][2012][2013][2014][2015] and all thinning operations were finished by fall 2011. These nine stands are located on the southwestern edge of BMEF and were analyzed separately because the sample of field plots used to train the LiDAR models did not include any stand subject to similar silvicultural interventions ( Figure 1). Sampled stands come from a long-term research project initiated in BMEF in 1991 and, excluding the nine thinned stands, were representative of the forest structures and forest management treatments applied in rest of BMEF.
Sampled stands were subject to six different types of treatments resulting from crossing two different factors. The first factor is the structural diversity. It has three levels referred hereafter as low structural diversity (LoD), high structural diversity (HiD) and research natural areas, RNA, or controls. Low structural diversity stands are subject to thinning operations aiming to generate simplified single-strata structures. High diversity stands are subject to thinning where all canopy layers and age groups are preserved, resulting in a multi-storied forest structure with trees of different sizes and ages. Neither the HiD stands nor the LoD stands were subject to thinnings during the period between the two available LiDAR flights. Finally, RNA stands are not subject to any thinning or harvest operation. In total, 10 LoD, 12 HiD and two RNA stands were measured in the field. The second factor under consideration was the presence or absence of prescribed forest fires. Half of the LoD, HiD and RNA stands sampled in the field had been subject to prescribed fires, but only one of the RNA stands was subject to prescribed fires during the period 2009-2015.

Sampling Design and Field Data
In total, 106 forested stands were delineated in BMEF. Small non-forested patches were masked in the study area and hence were not considered part of the population under study. Out of the 106 forested stands, 24 were selected and sampled in the field. Nine of the remaining 82 unsampled stands were subject to thinning during the period between two available LiDAR acquisitions (i.e., [2009][2010][2011][2012][2013][2014][2015] and all thinning operations were finished by fall 2011. These nine stands are located on the southwestern edge of BMEF and were analyzed separately because the sample of field plots used to train the LiDAR models did not include any stand subject to similar silvicultural interventions ( Figure 1). Sampled stands come from a long-term research project initiated in BMEF in 1991 and, excluding the nine thinned stands, were representative of the forest structures and forest management treatments applied in rest of BMEF.
Sampled stands were subject to six different types of treatments resulting from crossing two different factors. The first factor is the structural diversity. It has three levels referred hereafter as low structural diversity (LoD), high structural diversity (HiD) and research natural areas, RNA, or controls. Low structural diversity stands are subject to thinning operations aiming to generate simplified single-strata structures. High diversity stands are subject to thinning where all canopy layers and age groups are preserved, resulting in a multi-storied forest structure with trees of different sizes and ages. Neither the HiD stands nor the LoD stands were subject to thinnings during the period between the two available LiDAR flights. Finally, RNA stands are not subject to any thinning or harvest operation. In total, 10 LoD, 12 HiD and two RNA stands were measured in the field. The second factor under consideration was the presence or absence of prescribed forest fires. Half of the LoD, HiD and RNA stands sampled in the field had been subject to prescribed fires, but only one of the RNA stands was subject to prescribed fires during the period 2009-2015.
A sample of 151, 16 m radius plots (804 m 2 ) were measured in the field during the summer of 2009 and then remeasured during the summer of 2016. All field plots were located on nodes of the 100 m by 100 m grid of monumented markers at BMEF. Coordinates of the makers were determined using traverse methods and survey grade GPS observations and have an accuracy of 15 cm or better (see [23]). For each of the 26 stands selected for sampling, a node of the BMEF 100 m grid was randomly selected and used as a starting node for a 282 m by 282 m grid formed by selecting every other plot of the 100 m grid moving in the diagonal directions. Field measurements were taken on the nodes of the 282 m by 282 m grid (see Figure 1).
Within each field plot all live trees with DBH larger than 9 cm, and all dead standing trees with DBH larger than 12 cm, were stem mapped and measured for DBH and height. Plot basal area (BA) was derived directly from the field measurements. Volume (V) and above ground biomass (B) were computed as the sum of the individual tree volumes and biomasses of all standing trees. Individual tree volumes and biomasses were estimated using species-specific allometric models included in the national volume estimation library (NVEL) and in the national biomass estimation library (NBEL).
To account for the one-year difference between acquisition of field measurements in 2016, and the second LiDAR data acquisition obtained in 2015; plot-level values of the variables under analysis were computed for 2015 by linearly interpolating between the values obtained for 2009 and 2016. Finally, for each field plot we computed the change of V, B and BA on a per year basis, as the difference of the plot-level values in 2009 and 2015 divided by 6. For two plots close to the southeastern boundary of the forest, changes in V were extremely large, more than three standard deviations away from the mean value for the change in volume. These anomalous plots were removed from the analysis because such large changes seemed to be derived from edge effects. Plot-level values for 2009, 2016 and per year increments for the period, 2009-2015, for V, B and BA, in the remaining 149 plots are summarized in Table 1. For the nine unsampled stands thinned during the period 2009-2015 all thinning operations were completed by fall 2011. In total 427.40 hectares were thinned with prescriptions that varied among stands. Approximately 80% of the area was thinned from below, leaving a residual basal area of 17.22 m 2 ha −1 to 25.25 m 2 ha −1 . For the remaining 20% of the area, approximately one quarter was not thinned while the other three quarters were thinned to a residual BA that ranged from 6.89 m 2 ha −1 to 13.77 m 2 ha −1 . Fresh weight of total extractions for the 427.40 hectares subject to thinning was 11,009.38 Mg of logs and 23,164.32 Mg of chipped material.

LiDAR Data Acquisitions
Two LiDAR acquisitions are available for BMEF. The first LiDAR dataset was acquired during the summer of 2009 using a Leica ALS 50 discrete return sensor. Flying altitude was 900 m, side-lap between adjacent flight lines was at least 50% and scanning angle was ±14 • . The LiDAR data vendor generated digital terrain models (DTMs) with an accuracy of 15 cm at 95% confidence level. Additional details on the LiDAR data collection for the 2009 acquisition can be found in [24]. The same vendor in the study area performed a second LiDAR acquisition during the summer of 2015 using the same sensor, flying altitude and side-lap specifications. DTMs were also created for 2015 by the vendor.
Four sets of auxiliary variables were considered in this study. The first two sets are composed of 42 LiDAR predictors computed for each acquisition date. Set 1 will represent the predictors for 2009 and Set 2 the predictors for 2015. These predictors are descriptors of the point cloud height distributions and were all relative quantities to avoid introducing noise due to local differences in the point cloud densities of 2009 and 2015 [25]. The third set of predictors, Set 3, was computed as the differences between the 2009 and the 2015 LiDAR predictors. Finally, the fourth set of predictors, Set 4, included the incoming solar radiation computed using the Environmental Systems Research Institute (ESRI) ArcGIS Area Solar Radiation tool [26] with the 2009 digital surface model (DSM) as input; and two treatments: (1) single-or multi-story structural diversity and (2) presence or absence of prescribed fires. All predictors were computed for each field plot and for a grid with a cell size of 805 m 2 covering the entire BMEF. The cell size matched the field plot size and each cell of the grid was considered a population unit, equivalent to the field plots. Predictors and their corresponding acronyms used in further sections are summarized in Table A1.

AOIs, Target Parameter and Overview of Modelling Strategies
Two different types of subsets of population units will be repeatedly used throughout the manuscript in remaining sections. These subsets and their corresponding notation are: the sample of plots measured in the field, denoted using sub-index s and the target AOIs represented by pixels, denoted using sub-index α.
Three different groups of AOIs representing different levels of spatial aggregation were analyzed. The first group represents the largest level of spatial aggregation and represents the entire population under study. Within this group, we considered the set of all sampled stands, SS, and the entire BMEF study area after removing the nine thinned and unsampled stands, SA (i.e., sampled and unsampled but not thinned stands). The second group consists of the 106 forested stands in BMEF. In this group, we considered separately the unsampled and thinned stands (nine stands), unsampled and not thinned stands (73 stands) and sampled and not thinned stands (24 stands). Finally, the third group is the set of all pixels of the LiDAR grid covering the forested area in BMEF.
The main objective of this study was to analyze AOI-specific estimates of the change between 2009 and 2015 for three different structural variables, (V, B and BA). We will use the generic term variable of interest and the letter y to refer to the forest structural variables, and the term target parameter and Greek letter ∆ to refer to the quantities that we seek to estimate. Hereafter, target parameters will always refer to changes over time for the totals of the variables of interest in the considered AOIs, and will be expressed in a per hectare and year basis.
Considering that all pixels have the same area, the target parameter ∆ α for a generic AOI or subset of population units, α, can be expressed as: where N α is the number of population units (i.e., pixels) in the AOI. The terms y iα15 and y iα09 respectively represent the value of the variable of interest for 2009 and 2015 for the i th population unit of α, and δ iα is the change, for the i th pixel of α, in the variable of interest during the period 2009 to 2015. Finally, for comparability with previous studies, the variables of interest will be expressed in a per unit area basis, and the increments δ iα will be expressed in a per unit area and year basis. Thus, to ensure that ∆ α is expressed in the correct units, it is necessary to introduce the factors K yα and K δα . When y iα15 and y iα09 are expressed in a per unit area basis K yα = 1 6N α and for δ iα in a per unit area and year basis K δα = 1 N α . We calculated AOI estimates using two different methods. The first, δ-modeling method, for estimation of change uses models similar to those in approach A5 of Poudel et al. [8]. In this approach, the change in a structural variable at the plot/pixel-level (δ iα ) is directly modeled as a function of the LiDAR auxiliary variables available for the study area. The second, the y-modeling method, uses a modified version of approach A4 of Poudel et al. [8] to obtain AOI-specific estimates of change. Models in this approach jointly relate structural variables (y α15 and y α09 ) and LiDAR auxiliary information at a given point in time and account for the correlation between errors obtained for the same plot/pixel at different times. For both methods, variability between stands was accounted by considering them as small areas. Thus, stand-level random effects were included in the models. Models in the δ-modeling method relate the change (per year) of the variable of interest in a population unit to the auxiliary variables for the population unit. To indicate that these models consider change in the variables of interest directly, model parameters, stand-level random effects and model errors will include the subscript δ. Three different types of auxiliary variables were considered as potential predictors in the δ-modeling method. First, changes in the LiDAR auxiliary variables for the period 2009-2015, Set 3, were considered following Poudel et al. [8] as changes in LiDAR predictors are expected to correlate with growth or changes in forest attributes. Forest structure relates to growth. Thus, the LiDAR auxiliary variables for 2009, Set 1, were also considered as potential predictors that act as proxies for forest structure at the beginning of the period 2009-2015. Finally, the incoming solar radiation and the structural diversity factors and presence of prescribed fires, Set 4, were also considered as potential predictors.
For the j th population unit in the i th stand, models of the δ-modeling method have the form: where t indicate the transpose operator and x t δij is a vector of auxiliary variables in which the first element takes the value 1 for the intercept. The term β δ is a vector of model coefficients where the first element is the intercept of model (2). Selection of auxiliary variables included in the model was performed using the method described in [27]. Stand-level random effects v δi are assumed to be independently and identically distributed (i.i.d.) normal random variables v δi ∼ N(0, σ 2 δv ) for all i = 1, . . . , D, where D is the total number of stands in the study area. Model errors are i.i.d. normal random variables ε δij ∼ N(0, σ 2 δε ) independent of the stand-level random effects (i.e., Cov(ε δij v δ,k ) = 0, for all i, j and k). Models with spatially correlated errors and with non-constant error variances were initially considered but discarded in the model selection stage, as they were not found to be significant (see Section 2.5.3).
For a generic set of population units denoted by subscript ξ (which can represent either s or α), the relation in matrix notation between vector of changes of structural variables δ yξ , and the auxiliary variables included in the model (X δξ ), is expressed as: where δ ξ = (δ 1 , . . . , δ N ξ ) t , with δ k being the yearly change for the forest structural variable y, in the k th unit of ξ, and N ξ is the number of elements in the set ξ. The k th element of ξ will be an element of a given stand. To explicitly indicate this membership we will use, when necessary, the sub-indexes i th and j th to respectively indicate the stand and index of the element within the stand. The k th row of the matrix X δξ is x t δk . The vector v δ = (v δ1 , . . . , v δD ) t is a vector of stand-level random effects with variance covariance matrix G δ = σ 2 δv I D , where I D is the identity matrix of dimension D. The matrix Z ξδ is a N ξ x D incidence matrix that describes stand membership for each population unit. The r th row of Z ξδ have zeros at all positions except at position i, where i is the index of the stand to which the k th unit of ξ belongs. Finally, ε ξδ is a vector of N ξ model errors with diagonal variance covariance matrix R ξδ = σ 2 δε I N ξ . To simplify the notation, hereafter θ δ = (σ 2 δε , σ 2 δv ), will represent the vector of variance parameters. The variance covariance matrix of δ ξ is: Model (3) is a linear mixed effect model and a special case of the basic unit-level described in [28] (pp. 174).

Target Parameter δ-modeling Method
Under model (2) the target parameter (1) for a generic AOI α can be expressed as: Thus, the target parameter is a linear target parameter similar to the one considered in [4] where 1 t is a vector of ones and l t δα = 1 N α 1 t X δα , m t δα = 1 N α 1 t Z t δα and q t δα = 1 N α 1 t are vectors of known constants for the target AOI α.

Model Selection and Estimator δ-modeling Method
The target parameter ∆ α was estimated for all considered AOIs using∆ α the empirical best linear unbiased predictor (EBLUP) described in [29]. For each variable of interest, auxiliary variables included in X δα were preselected using the best subset selection procedure described in [29] (pp. 179-180). When models with similar values of model root mean square error or coefficients of determination were compared, the preferred option was to select the model with smallest values of σ 2 δv . This criterion is appropriate to minimize the leading term of the AOI specific mean square errors [29] (pp. 176). Pre-selected models considered constant model error variances and no spatial correlation of model errors were fitted using maximum likelihood (ML). In a subsequent stage, models were re-fitted using ML including: (1) an exponential spatial correlation model for the model errors and (2) a non-constant error variance where ε δij ∼ N(0, σ 2 δε k 2w δ i j ). The term k i j is the value of the predictor included in the model most correlated to δ and w δ is an additional parameter to account for heteroscedasticity. For all variables, no clear patterns of spatial correlation or non-constant variances were observed, which supports the model form described in Section 2.5.1.

MSE Estimators for the δ-modeling Method
For all AOIs, the mean squared error of the EBLUP was estimated using the estimator provided by [32] and extended in [4] to account for the fact that AOIs can contain a small number of population units. This estimator is the sum of three components where the last one, 2g 3,α (θ δ ), is a bias correction factor:M The first term of (9) equals: The second term of (9) is: } accounts for the uncertainty due to the estimation of the random effects while g 2δα (θ δ ) accounts for the uncertainty due to estimating β δ . For model (2), it is possible to compute a bias correction factor for the mean square error estimator, that accounts for the uncertainty due to estimating θ δ . This correction factor equals: where, b t δα = m t δα G(θ δ )Z t δs V δs (θ δ ) −1 and H δs (θ) is the Fisher information matrix for the fitted model. Explicit formulas for g 3δα (θ δ ) are provided [29] (pp. 179-180). This bias correction factor was used as a reference in comparisons with the y-modeling method.
All estimators of the mean square errors for AOIs in unsampled stands were made assuming that the model fitted for the sampled stands holds in the unsampled stands. Under this assumption, the leading term g 1δα (θ δ ), ofM SE{∆ δα (θ δ )} will be larger than if the stand containing the AOI was sampled. This occurs because the negative term m t δαĜ (θ δ )Z t δsV δs (θ δ ) −1 Z δαĜ (θ δ )m δα makes the term g 1δα (θ) smaller as the stand sample size increases. For given time t, the variable of interest in the j th population unit in the i th stand is expressed as: where x t yijt is a vector of auxiliary variables, specific for time t, in which the first element takes the value 1. The term β yt is a vector of time-specific coefficients with the first element representing the model intercept. The random components of model (13) are the stand-level random effects u yit and the model errors e yi jt . To account for heteroscedasticity, model errors e yi jt were of the form e yi jt = ε yi jt k ω yt i jt with ε yijt ∼ N(0, σ 2 yεt ); the term k i jt , the predictor included in the model for time t, is most correlated to y t , and ω yt is a parameter to model the change in the error variance. The stand-level random effects u yit were assumed to be independently and identically distributed (i.i.d.) normal random variables where D is the total number of stands in the study area. Model errors were assumed independent of the stand-level random effects (i.e., Cov(e yi jt , u ykt ) = 0, for all i, j and k). Finally, model errors were consider independent with Cov(ε yi jt , ε yklt ) = 0 if i k or j l for both t = 2009 and t = 2015. Models with spatially correlated errors were initially considered, but discarded for both years in the selection stage as no clear spatial correlation patterns were observed in the residuals. Auxiliary variables included in the model were selected following the same procedure used in the δ-modeling method, using the best subsets selection procedure described in [27].
To account for expected correlations, models for 2009 and 2015 were combined into a single model where stand-level random effects and model errors for 2009 and 2015 were allowed to be time correlated. Then for the j th population unit in the i th stand the two-dimensional vector y i j = (y i j09 , y i j15 ) t of variables of interest was related to the auxiliary variables through the following model: with: where 0 t p2009 and 0 t p2015 are, respectively, row vectors of zeroes of dimensions equal to x t yi j09 and x t yi j2015 . As with the time-specific models, to account for heteroscedasticity in the combined model, model errors e yijt were of the form e yi jt = ε yi jt k ω yt i jt with ε yi jt ∼ N(0, σ 2 yεt ). The parameters ω yt were updated when fitting the combined model. Spatial correlation of model errors was not found to be significant when considering each year separately, therefore, no spatial correlation patterns were considered in the combined model. The only source of correlation of model errors present in the combined model was temporal correlation. For a given location, the variables ε yi jt yi j09 ∼ N(0, σ 2 yε09 ) and ε yi j15 ∼ N(0, σ 2 yε15 ) were allowed to be correlated random variables. The correlation between ε yi j2009 and ε yi j2015 is ρ ε and the variance-covariance matrix of e yi j is: To model correlation between stand-level random effects, three random components v yi , v yi2009 and v yi2015 , independent of each other, were considered. These random components had distributions v yi ∼ N(0, σ 2 yv ), v yi09 ∼ N(0, σ 2 yv09 ) and v yi15 ∼ N(0, σ 2 yv15 ). Stand-level random effect for a given point, at time t, u yit , are the sum of a pure stand effect, independent of time t, v yi , and a time-specific stand represents these sums.
The variance covariance matrix of v yi is diagonal, therefore, the variance covariance matrix of u yi is: The fact that the random effect v yi is present for both 2009 and 2015 results in a positive correlation of the terms of u yi , with a correlation coefficient . In a last step, models with a simpler structure of random effects were fitted and compared to the original models using a likelihood ratio test. Simplified models contained only random effects v yi that did not depend on time (i.e., models did not contain time-specific random effects v yi09 and v yi15 ). For simplified models u yi09 = u yi15 = v yi . For a generic set of population units ξ, the combined model can be expressed in matrix notation as: where y ξ , e yξ , and X yξ , are obtained stacking the vectors y i j , e yi j , or the matrices X i j of all units in ξ. As no spatial correlation patterns were found, the variance covariance matrix of e yξ is, for the simplified models and 2N ξ x3D for the models with time-specific random effects. The variance covariance matrix of y ξ can be expressed as: In Equation (19), it is explicitly indicated that matrices V yξ (θ y ), G y (θ y ) and R yξ (θ y ) depend on the vector of variance-covariance parameters θ y = (σ 2 yv , σ 2 yv09 , σ 2 yv15 , σ 2 yε09 , σ 2 yε15 , ρ ε ) t . For the models with simplified random effects the vector of variance covariance parameters reduces to θ y = (σ 2 yv , σ 2 yε09 , σ 2 yε15 , ρ ε ) t . Model (18) (1) for a generic AOI, α is a linear combination of the form: where l t yα = q t yα X yα , m t yα = q t yα Z yα and q t yα are vectors of known constants for the target AOI α, with q t yα a vector of dimension 2N α where the k th element equals It is important to remark that for models with a simplified structure of stand random effects, the target parameters do not depend on u y . For these models, y i j15 − y i j15 = (x t yi j15 β y15 − x t yi j09 β y09 ) + (e yi j15 − e yi j09 ), and u yi09 − u yi15 = v yi − v yi = 0. For these type of models, one can expect significant gains in accuracy because it is not necessary to estimate random effects.  (18) is a linear mixed effects model with block diagonal structure and ∆ α a linear model parameter; thus, after [29] (pp. 108-110), the EBLUP∆ y (θ y ) of ∆ α is: whereβ y (θ y ) equals:β MatricesV ys (θ y ),Ĝ y (θ y ) andR ys (θ y ) are obtained by replacing the estimated variance parameters θ y in V yξ (θ y ), G y (θ y ) and R yξ (θ y ), by their REML estimatesθ y . EBLUPs∆ α are also functions ofγ and are obtained using formula (7), wherev δ (θ δ ) equals: As with the δ-modeling method, estimates for AOIs in unsampled stands were made assuming that the model fit for the sampled stands also applied in the unsampled stands, which leads to m t yαvy (θ y ) = 0 and∆ yα (θ y ) = l t yαβy (θ y ) is a synthetic predictor. For all AOIs, the estimator of the mean square error of the EBLUP under the y-modeling method, The terms g 1yα (θ y ) and g 2yα (θ y ) in (24) are analogous to those in (10) and (11) and have similar interpretation. To compute g 1yα (θ y ) and g 2yα (θ y ), matricesĜ δs (θ δ ),R δs (θ δ ),V δs (θ δ ) andR δα (θ δ ) must be replaced byĜ ys (θ y ),R ys (θ y ),V ys (θ y ) andR yα (θ y ). For the y-modeling method we did not compute the second-order correction factors.

Comparison of Methods
Methods were compared using three different criteria. First, we used general measures of accuracy providing the average error or uncertainty of prediction at the pixel-level (2.7.1); then, we compared methods using AOI-specific estimates and measures of uncertainty (2.7.2). Finally, we assessed the risk of generating biased predictions when using the δ-modeling method and y-modeling method in unsampled stands (Section 2.7.3).

General Accuracy Assessment
To compare the δ-modeling method and y-modeling method, a fist assessment was made using the cross-validated model mean squared error, mRMSE, and the model bias mBias: where δ ij is the observed value of change for the j th plot included in the i th sampled stand andδ i j is the predicted value for that plot when model coefficients are obtained removing that plot from the training dataset. For the y-modeling methodδ i j is obtained using the observed and fitted values of the variable of interest, asδ i j = 1 6 (ŷ i j15 −ŷ i j09 ) whereŷ i j09 andŷ i j15 are the predictions of y i j09 and y ij15 are obtained fitting the corresponding y-model without the observations for plot i j. In addition, we computed mRMSE and mBias in terms relative to the average changes observed in the sampled plots. These quantities are denoted as mRRMSE = mRMSE/∆ f and mRBias = mBias/∆ f where∆ f is the mean of the changes observed in the field plots.

AOI-specific Comparisons.
For each of the considered areas of interest an estimate by each method (i.e., EBLUPs using either ∆ δα (θ y ) or∆ yα (θ y )) and their corresponding mean square error estimators (i.e.,M SE{∆ δα (θ y )} or MSE{∆ yα (θ y )}) were available. First, for each AOI and method, we directly compared∆ δα (θ y ) and ∆ yα (θ y ), and the square roots ofM SE{∆ δα (θ δ ) andM SE{∆ yα (θ y ). To simplify the notation, we will omit the subscript indicating the target AOI unless it is necessary and refer to∆ δ as∆ y . Similarly, after omitting the subscript α, the AOI specific root mean square errors will be denoted as: To perform an assessment relative to the predicted values the following coefficient of variation: was computed for each AOI and method. Finally, for each AOI we compared CV{∆ δα (θ y )} to CV{∆ yα (θ y )}. To simplify the notation we will refer to these coefficients of variation as CV δ and CV y . Finally, for each sampled AOI we computed, using only the field information, the sample mean∆ f α and its standard error: and its coefficient of variation CV f α . In Equation (31), n α is the number of field plots in the considered AOI and the sub-index f is used to indicate that these quantities are calculated using only field data. Again, to simplify the notation we removed the sub-indexes α unless they were necessary. Finally, the sample mean and the coefficient of variation were then compared to their counterparts (7) and (29) and (21) and (30) obtained by the δ-modeling method and y-modeling method, respectively.

Extrapolation to Thinned Stands
The fact that thinned stands were not represented in the sample of field plots raises the question of how applicable the models obtained are using either the δ-modeling method or the y-modeling method to these stands. Applying the models to these stands involves a degree of extrapolation to a different population and a high risk of producing biased predictions. We assessed this risk by comparing the distributions of the LiDAR predictors included in the models for the δ-modeling method and the y-modeling method for the sample of field plots to the distributions of the predictors in: (1) the sampled stands, (2) the unsampled and not thinned stands and (3) unsampled and thinned stands. Within each group (i.e., field plots, FP; sampled stands not thinned, SS; unsampled stands not thinned, UN; and unsampled stands subject to thinning, UT), we estimated density functions for each LiDAR predictor using a Gaussian kernel and a bandwidth determined using Silverman's rule [33]. Note that we considered two AOIs for the largest level of aggregation, the first one is SS and the other one, SA, is the union of SS and UN. We first considered each predictor separately and graphically compared their density functions. Predictors for 2009 and 2015 in the y-modeling method were considered separately. For each predictor and group, we computed the area of overlap, AO, with the density function for the field plots which takes value 0 if there is no overlap and value 1 if the distribution of the predictor in the considered group equals the distribution for the sample.
In addition to the area of overlap and aiming to consider all predictors in a given model at once, we calculated NT2, the average of Mesgaran's novelty index NT2 for each model and group [34]. This quantity provides the average Mahalanobis distance from the pixels of the group to the mean of the sample of field plots, and it is expressed in terms relative to the maximum Mahalanobis distance observed in the sample. Values of NT2 above one indicate that on average pixels in a group are at a distance to the mean of the field plots larger than the distance from the extreme field observation to the mean of the field plots. We also calculated NT2 mean , the average of NT2, but using the mean Mahalanobis distance as normalizing constant instead of the maximum. The reference value of one for NT2 mean indicates that the average Mahalanobis distance from pixels, to the mean of the field plots, is the same as the average of the Mahalanobis distances observed in the sample. Means and variance covariance matrices for computation of Mahalanobis distances are always estimated using the sample of field plots.

Selected Models δ-modeling Method and y-modeling Method
Selected models for the δ-modeling method included auxiliary variables from Set 1 and Set 3. It was possible to find alternative models including fixed effects for the diversity treatments (i.e., predictors from Set 4) with similar values of mRMSE and mRBias; however, those models did not improve the model fit. From a practical point of view, models that only depend on the LiDAR variables but do not depend on the structural diversity treatments or the presence/absence of prescribed fires make the models more portable and applicable to stands without needing to know exactly which one of these treatments was applied. Considering that models using the structural diversity and presence of prescribed fires as predictors did not result in important gains in accuracy, we selected models that were not dependent on these treatments ( Table 2). For the models in the δ-modeling method, the variance of random the effects,σ 2 δv , was very small compared to the variance of the model errors,σ 2 δε , ( Table 2). This indicated that, in this forest and for these variables, the use of synthetic estimators that do not account for the variability between stands should not cause a strong bias problem.
Models for the y-modeling method showed a pattern similar to that observed for the δ-modeling and only included predictors from Set 1 and Set 2 ( Table 3). Errors showed non-constant variance patterns for all variables. The predictor most correlated with the variable of interest (i.e., the predictor used to model the error variance) was the same for 2009 and 2015. For V and B, the variance of model errors was a function of the square of the mean LiDAR elevation (Elev_mean 2 ), and the exponents of the error variance function were very close to those obtained in [4,17,27] for V, and in [27] for B. For BA, variance of model errors was a function of the percentage of first returns above two meters (PcFstAbv2). Based on the results of the likelihood ratio tests, that for all variables resulted in p-values larger than 0.87, simplified models were selected and used for prediction. Table 3. Summary models for the y-modeling method. Model coefficients, standard errors of the model coefficients, variance-covariance parameters and general metrics for accuracy assessment are provided. Covariate acronyms are explained in Table A1. Coef is the value of the coefficient and Std.Error its corresponding standard error. Coef is the value of the coefficient and Std.Error its corresponding standard error. V indicates volume, BA indicates basal area and B indicates biomass.

Model
Year

General Accuracy Assessment and Comparison of Methods
For all variables and modeling alternatives, values of mBias and mRBias were orders of magnitude smaller than mRMSE and mRRMSE (Tables 2 and 3). For all variables and methods, the percentages of explained variance for the change in V, BA and B were low. For the δ-modeling method, models explained 34.38%, 31.37% and 39.04% of the variance of the change in V, BA and B, respectively. For the y-modeling method, models explained only 10.65% and 5.37% of V and B, respectively, while for BA the prediction using the y-modeling method was not better than the sample mean. In addition, δ-models had values of mRBias lower than those obtained for the y-models. When instead of the change we considered the forest structural attributes with the y-modeling method, percentages of explained variance were 82.16% for V, 82.53% for BA and 82.93% for B. Considering only 2009, the percentage of explained variance for V, BA and B was 81.60%, 83.45%, and 82.84%, respectively. Considering only 2015, the percentage of explained variance for V, BA and B was 82.72%, 81.42% and 82.98%, respectively.

Entire Study Area
Estimates for the sampled stands and for the whole study area using either the δ-modeling method or the y-modeling method were consistent with the estimates obtained using only the field information except for BA and B in SA. For the entire study area values of RMSE δ tended to be smaller than RMSE y . When considering the sampled stands, SS, approximate confidence intervals computed as∆ f ± 2SE f for the field estimates, and as∆ δ ± 2RMSE δ and∆ y ± 2RMSE y for each one of the LiDAR based methods, overlapped for all variables (Table 4) and contained estimates derived from other methods. Differences between the uncertainty of estimates obtained from LiDAR-based methods and the uncertainty of estimates obtained from field-based methods tend to be of small magnitude. Table 4. Average increments of volume V, basal area BA and biomass B in the entire study area excluding the thinned stands (SA) and for the union of the sampled stands (SS). Estimates (∆), root mean square errors (RMSE), coefficients of variation (CV) and confidence intervals (CI) obtained using the δ-modeling method and the y-modeling method are compared to estimates (∆ f ), standard errors (SE f ) coefficients of variation (CV f ), and confidence intervals (CI f ) using only the field information.

Stands
Estimated change of V, BA and B in the sampled stands by both the δ-modeling method and the y-modeling method agreed with their field-based counterparts in most stands ( Figure 2). However, the width of the confidence intervals obtained using the δ-modeling method tended to be larger than the confidence intervals of the estimates derived using the y-modeling method (Figure 2). For unsampled stands, estimates and confidence intervals had larger variability in stands where the forested area was small (Figure 3). This variability cannot be avoided, and indicates that certain sources of errors cannot be compensated if the number of pixels that are aggregated is low. Finally, for both methods, values of and were in the range of 0.25 to 1 m 3 ha −1 year −1 for V, of 0.02 to 0.15 m 2 ha −1 year −1 for BA and of 0.10 to 0.80 Mg ha −1 year −1 for B. However, for B and V, the tended to be smaller than while negligible differences between methods were observed for BA (Figure 3 and Figure 4). For unsampled stands, estimates and confidence intervals had larger variability in stands where the forested area was small (Figure 3). This variability cannot be avoided, and indicates that certain sources of errors cannot be compensated if the number of pixels that are aggregated is low. Finally, for both methods, values of RMSE δ and RMSE y were in the range of 0.25 to 1 m 3 ha −1 year −1 for V, of 0.02 to 0.15 m 2 ha −1 year −1 for BA and of 0.10 to 0.80 Mg ha −1 year −1 for B. However, for B and V, the RMSE y tended to be smaller than RMSE δ while negligible differences between methods were observed for BA (Figures 3 and 4). For the thinned (and unsampled) stands, differences between -modeling method and the -modeling method for BA were large and their confidence intervals did not overlap (Figure 3). For these stands, the estimates for BA using the -modeling method tended to indicate almost no changes in BA. Estimates for the thinned stands using the -modeling method provided inconsistent results indicating gains in B, and changes close to zero for V and BA. Certain inconsistencies were also observed for stands subject to thinning when using the -modeling method where predictions of the change in V and B were positive for three and five stands respectively. These inconsistencies seem to derive from the fact that the distribution of predictors in Set 3 (i.e., changes in LiDAR predictors) in the thinned stands was rather different to the distribution of these predictors in the sample of field plots, in the sampled stands and in the unsampled and not thinned stands. For predictors of the -modeling method, modeled differences between thinned stands and the remaining groups were of much smaller magnitude. Results for the analysis of the extrapolation risks are presented in detail in section 0. For the thinned (and unsampled) stands, differences between δ-modeling method and the y-modeling method for BA were large and their confidence intervals did not overlap (Figure 3). For these stands, the estimates for BA using the δ-modeling method tended to indicate almost no changes in BA. Estimates for the thinned stands using the δ-modeling method provided inconsistent results indicating gains in B, and changes close to zero for V and BA. Certain inconsistencies were also observed for stands subject to thinning when using the y-modeling method where predictions of the change in V and B were positive for three and five stands respectively. These inconsistencies seem to derive from the fact that the distribution of predictors in Set 3 (i.e., changes in LiDAR predictors) in the thinned stands was rather different to the distribution of these predictors in the sample of field plots, in the sampled stands and in the unsampled and not thinned stands. For predictors of the y-modeling method, modeled differences between thinned stands and the remaining groups were of much smaller magnitude. Results for the analysis of the extrapolation risks are presented in detail in section 0.  (Table 5 and Figure B1). Predictions from the -modeling method tend to be smoother than predictions from the -modeling method. For all variables, the proportion of pixel-level predictions using the -modeling method within the range of values observed for the field plots, was always 99.84% or larger ( Figure B2). Considering that these results were obtained in the presence of thinned stands and the relatively small fraction of the forest that was sampled, obtaining less than 0.16% of the predictions outside of the measurement range seems to be a clear sign of over smoothing (see Appendix B).  (Table 5 and Figure A1). Predictions from the δ-modeling method tend to be smoother than predictions from the y-modeling method. For all variables, the proportion of pixel-level predictions using the δ-modeling method within the range of values observed for the field plots, was always 99.84% or larger ( Figure A2). Considering that these results were obtained in the presence of thinned stands and the relatively small fraction of the forest that was sampled, obtaining less than 0.16% of the predictions outside of the measurement range seems to be a clear sign of over smoothing (see Appendix B).

Extrapolation to Thinned Stands
Estimates of change in B for the thinned stands by both methods were clearly subject to bias problems. The predicted change in B for the total area subject to thinning for the period 2009-2015, using the δ-modeling method was an increase in biomass of 40,469.22 Mg. The predicted change using the y-modeling method was a removal of B. However, the predicted removal for the period 2009-2015 was only 1750.29 Mg while the weighted extractions for the thinned stands were orders of magnitude larger. For BA both methods estimated extractions in BA, which is consistent with the fact that these stands were thinned. Estimated changes in BA using the δ-modeling method for the thinned stands ranged from −0.05 m 2 ha −1 to −1.88 m 2 ha −1 , which seems to be a very small change in basal area. Estimated changes in BA using the y-modeling method ranged from −3.58 m 2 ha −1 to −7.01 m 2 ha −1 . An advantage of the y-modeling method is that it allows obtaining the values of the structural attributes at a given point in time. Using the y-model we estimated BA for the thinned stands for 2015. For those stands where thinning prescriptions dictated leaving a residual BA of 17.22 m 2 ha −1 to 25.25 m 2 ha −1 , estimated BA for 2015 ranged from 19.87 m 2 ha −1 to 26.22 m 2 ha −1 , which is in accordance with the thinning prescriptions. For the remaining area subject to thinning the estimated BA for 2015 was 17.64 m2 ha −1 , while the prescriptions dictated leaving a residual BA ranging from 6.89 m 2 ha −1 to 13.77 m 2 ha −1 in 75% of the area and leaving the remaining area untouched. In general, the estimated BA for 2015 are consistent with the prescriptions, which indicates that the y-modeling method produces reasonable estimates of BA when extrapolating to the thinned stands. In summary, for the estimation of changes, biases derived from extrapolation seemed to be of larger magnitude for the δ-modeling method although they were also present for the y-modeling method.
The extrapolation indexes NT2 and NT2 mean showed that predictions in thinned stands, involved a large amount of extrapolation when using the δ-modeling method. For the y-modeling method, differences between thinned stands and stands not subject to thinning were of smaller magnitude (Figures 5 and A3). The inspection of the distribution of the LiDAR predictors in the field plots, the sampled and not thinned stands, unsampled and not thinned stands and the unsampled and thinned stands showed similar results for all variables, being the distributions of predictors from Set 3 (i.e., changes in LiDAR predictors for the period 2009-2015) very sensitive to the thinning operations ( Figures 6, A4 and A5).   [35], NT2, for the sampled and not thinned stands (dark blue), unsampled stands not thinned (green) and unsampled and thinned stands (red). The value of this index for the field plots (light blue) provides the baseline value (i.e., the value observed for the sample of field plots).
Remote Sens. 2019, 11, x FOR PEER REVIEW 23 of 34 Figure 6. Comparison of density functions for the predictors in the models used to estimate changes in Volume using the -modeling method and -modeling method in field plots (light blue), sampled and not thinned stands (dark blue), unsampled and not thinned stands (light blue) and unsampled and thinned stands (red). For each group the area of overlap, AO, with the density function for the field plots (green) is provided for each predictor.

General Accuracy Assessment and Comparison of Methods.
The smallest values of were obtained using the -method, which is consistent with previous results reported by Poudel et al. [8] for V and B in coastal coniferous forest of Western Oregon and by Temesgen et al. [9] for B in spruce-dominated forest of Alaska (Table 2 and Table 3).

Figure 6.
Comparison of density functions for the predictors in the models used to estimate changes in Volume using the δ-modeling method and y-modeling method in field plots (light blue), sampled and not thinned stands (dark blue), unsampled and not thinned stands (light blue) and unsampled and thinned stands (red). For each group the area of overlap, AO, with the density function for the field plots (green) is provided for each predictor.

General Accuracy Assessment and Comparison of Methods.
The smallest values of mRMSE were obtained using the δ-method, which is consistent with previous results reported by Poudel et al. [8] for V and B in coastal coniferous forest of Western Oregon and by Temesgen et al. [9] for B in spruce-dominated forest of Alaska (Tables 2 and 3). We observed, however, smaller differences between methods. Additionally, as observed in previous studies, (e.g., [8,21,35,36]) where LiDAR auxiliary variables showed a much stronger correlation with structural attributes at a given point in time than with their change.
Values of mRMSE for V were 3.47 m 3 ha −1 year −1 when using the δ-method and 3.76 m 3 ha −1 year −1 when using the y-method. These values are slightly smaller than the mRMSE obtained by Poudel et al. [8] using the δ-method  [20] in coniferous forest in Norway, using the y-method with log-transformed models and two LiDAR acquisitions that were two years apart from each other. In relative terms, for V and B, the values that we obtained for mRRMSE were considerably larger than those obtained by Poudel et al. [8]. These differences are due to the fact that observed growth rates in Poudel et al [8] are much higher than we observed at BMEF.

Entire Study Area
Most studies on estimation of change of structural variables using repeated LiDAR measurements have focused on analyzing indexes of model fit and reported only global measures of accuracy developed at a plot level. There is an important difference between the values of RMSE δ and RMSE y and mRMSE δ and mRMSE y being mRMSE δ and mRMSE y an order of magnitude larger than RMSE δ and RMSE y . Model root mean square errors mRMSE δ and mRMSE y provide an average measure of the errors that can occur when predicting a single pixel. For large areas, there will be some level of compensation of overpredicted and underpredicted pixels. Knowing how important that compensation is requires calculating AOI-specific root mean square errors. These AOI-specific measures cannot be directly derived from mRMSE because RMSE δ and RMSE y consider factors such as the uncertainty in the estimation of the fixed and random effects that are not accounted for in mRMSE δ or mRMSE y . The effect of these factors in RMSE δ and RMSE y can cause that the way two models with similar values of mRMSE rank based on this metric could change when attending to RMSE. But the most important consequence of the disconnect between mRMSE δ and mRMSE y and AOI-specific measures of uncertainty, is that the former cannot be used as quality controls in LiDAR based inventories.
While numerous studies on estimation of changes using LiDAR rely on global measures of accuracy such as mRMSE, exceptions to this trend can be found in the literature [10,12,[18][19][20]. The last four studies used model assisted techniques to derive either landscape or stratum level changes. Reported errors in those studies changed depending on the modeling techniques and study areas, but they all were of similar magnitude for changes in biomass per hectare and year (Table 4). Errors for the methods tested in this study were smaller than those reported by [10], where changes in live carbon stocks in Norway were estimated using generalized regression estimators (GREG). Differences with the errors reported in [10] for carbon, using the same 0.5 biomass to carbon conversion factor, were in the range of 0.12-0.09 Mg ha −1 year −1 . These differences seem to be due to multiple factors such as differences between study areas, changes in live biomass versus changes in standing biomass, time between LiDAR acquisitions and field plot sizes etc. Further investigation is needed to test if the model-based estimators studied here and the GREG estimators in [10] have a similar performance when used under the same conditions.
The study from Magnussen et al. [18] also included model based estimators using the y-modeling method. Reported errors were slightly larger than the ones observed here but at the same time smaller in terms relative to the observed mean change. An important result from the comparisons of [18] was the drastic improvement in model accuracy when developing stratum specific models (i.e., a set of model coefficients per stratum) as opposed to a global model for the whole study area. The mixed effect models used in this study can be used in combination with stratification if sample sizes are large. The introduction of stand level random effects allows for certain variability between AOIs that can be applied in situations where AOI sample sizes are limited.

Stands
One of the novelties of this study was the analysis of estimates for AOIs with small sample sizes to develop AOI specific models (i.e., stands). While at large scales both LiDAR-based and field-based estimates were very similar and had equivalent accuracies, at the stand-level, LiDAR based estimates, clearly had smaller errors than their field-based counterparts do. Qualitatively, this result for the change in V, BA and B is similar to the results obtained in [15,17] for the structural variables themselves and shows that the LiDAR auxiliary information allows for gains in efficiency when estimating changes in AOIs with small sample sizes. However, due to the low correlation of LiDAR and structural changes, values of CV δ and CV y were oftentimes larger than 50%. These values of CV are larger than those observed for structural variables in similar AOIs in previous studies [4,14,17]. While differences were not of large magnitude RMSE y , tended to be smaller than RMSE δ . In addition, RMSE y had a larger variability because errors did not have constant variance. Finally, stand level estimates using the δ-modeling method in the thinned and unsampled stands involved an important degree of extrapolation that can cause inconsistent estimates and severe biases, which indicates that the δ-modeling method is more sensitive to extrapolating than the y-modeling method.

Pixel-level
For the most detailed level of disaggregation, the magnitude of the errors was very large. This is due to the low correlation between LiDAR auxiliary variables and the change in structural attributes. First-order and second-order texture indexes [37] are auxiliary variables with a promising potential for future research aiming to improve the prediction of structural changes. While for structural variables, maps at the pixel-level can provide a reliable reference about the forest structure; for growth and changes, pixel-level maps like the one in Figure A1 should be taken as mere approximations. They could be used to infer certain trends and patterns, but the high values of RMSE δ and RMSE y show that estimates for a particular location made at the pixel scale can differ significantly with reality. These results clearly indicate that, predictions at such a fine scale are highly unreliable, and it is necessary either to perform some level of spatial aggregation or to increase the lapse between LiDAR acquisitions.

Advantages of Modeling Alternatives
In general, the δ-modeling method was found to be a better alternative to estimate changes for the entire study area than the y-modeling method; however, the y-modeling method produced better results at the stand-level and also seemed to be advantageous to prevent problems related to extrapolation to values of the covariates outside of those included in model development.
The δ-modeling method offers a faster model developments and fitting, and is significantly simpler than the modeling with the y-modeling method, as it is not necessary to consider differences between years and time correlations. The main disadvantage of this method is that it seems to be more prone to extrapolation errors. Predictors from Set 3 are sensitive to intense changes in the forest structure caused for example by thinning (see Figures 5, 6, A4 and A5). The inspection of predictors of alternative models for V and B using this method revealed that inconsistencies of predictions in unthinned stands could be attenuated including more predictors from Set 1. The sensitivity to changes of predictors from Set 3 can be an advantage if all possible changes are correctly represented in the field sample. However, for relatively short periods of time between acquisitions and a low amount of forest operations, changes that are not very frequent in the landscape can be misrepresented or even not included in the sample. Thus, results for areas subject to those changes can be severely biased and inconsistent. The more complex model development for the y-modeling method may be compensated by its ability to produce a richer set of outputs, by its apparently smaller risk of extrapolation and by its more accurate estimates for AOIs with small sample sizes (i.e., stands). In this study we analyzed the performance of the y-modeling method when estimating change, but estimates of V, BA and B for all the AOIs in 2009 and 2015 could have been readily obtained using this method. Results from our study also support the idea that the y-modeling method has advantages over the δ-modeling method in terms of protection against inconsistent extrapolations. The distributions of predictors from Set 1 and Set 2 in thinned stands were relatively similar to the distributions observed for the sample while the distributions of predictors in Set 3 used in the δ-modeling method, these distributions were rather different (see Figures 5,6, A4 and A5). The greater similarity between thinned stands and the sample of field plots, for predictors from Set 1 and Set 2, indicates that the effect of thinning, in terms of auxiliary information, can be seen as transition from one situation in 2009 to another in 2015, and both seem to be represented in the field sample (e.g., Figure 5). If structures before and after the thinning (or other changes) are represented in the sample, the need for extrapolation will be limited. Within certain limits, if the sampling design covers all structures present at both points in time, even if there is a particular change from one structure to another that is not represented in the sample, predictions from the y-modeling method will not involve large extrapolations.

Conclusions
The four main conclusions obtained from this study include:

•
The change of structural attributes and LiDAR auxiliary information are weakly correlated. This weak correlation seems to more evident in BMEF than in previous studies because of the slower growth in the study area and the relatively short lapse of time between LiDAR acquisitions, which indicates that for future studies in similar areas it might be necessary to increase the time lags between LiDAR flights. • In general, the δ-modeling method was found to be a slightly more accurate alternative to obtain estimates of change for the whole study area; however, the y-modeling method was able to produce better estimates at the stand level. In addition, the y-modeling method method also seemed to be less prone to extrapolation problems. This indicates that field campaigns for the δ-modeling method have to be carefully designed while the y-modeling method might be less sensitive to certain bias problems.

•
Despite the weak correlations with the changes in structural attributes, LiDAR auxiliary information allows obtaining estimates of growth for stands that improve over those derived using only field information.

•
The large uncertainty observed for pixel-level predictions indicated that high-resolution maps of growth, generated using LiDAR auxiliary information in similar conditions, should be taken as approximated products. Funding: This research received no external funding.

Acknowledgments:
We would like to thank the USDA Forest Service, Region 5 and Lassen National Forest for assistance in project implementation, and the Guest Editor and two anonymous Reviewers for their constructive comments.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix B
Predictions from the δ-modeling method tend to be smoother than predictions from the y-modeling method ( Figure A1). For all variables, the proportion of pixel-level predictions using the δ-modeling method within the range of values observed for the field plots, was always 99.84% or larger. Considering the presence of thinned stands and the relatively small fraction of the forest that is sampled. Obtaining less than 0.15% of the predictions outside of the measurement range seems to be a clear sign of over smoothing. Predictions using the y-modeling method showed a greater variability, especially for BA, and the proportions of predictions inside the range of observed values, P y , were 99.45% for V, 95.82% for BA and 99.29% for B. For BA, pixel-level predictions using the y-modeling method were oftentimes negative and of larger magnitude than the changes in BA observed for the plots. However, these pixels represent a small proportion of the total predictions (i.e., 4.02%), and a significant portion of them correspond to pixels in thinned stands. It is important to note that these comparisons of predicted values inform about how similar predictions are by the two analyzed methods and cannot be considered as indicators of accuracy or reliability. For all variables, pixel-level predictions by both methods were strongly correlated with Pearson's correlation coefficients of 0.92, 0.82 and 072 for V, BA and B, respectively ( Figure A2). Finally, considering the unsampled and thinned stands, pixel-level predictions obtained by both methods showed the same inconsistencies observed at the stand-level especially for B using the δ-modeling method where only about 4%, of the predicted values were negative (i.e., removals of B). These inconsistencies are clearly due to extrapolations in the thinned stands and are analyzed in more detail in next section.   Appendix C Figure C1. Indexes of extrapolation. Average of Mesgaran's novelty index relative to the mean, 2 , for the sampled and not thinned stands (dark blue), unsampled stands not thinned (green) and unsampled and thinned stands (red). The value of this index for the field plots (light blue) provides the baseline value (i.e., the value observed for the sample of field plots). Figure A3. Indexes of extrapolation. Average of Mesgaran's novelty index relative to the mean, NT2 mean , for the sampled and not thinned stands (dark blue), unsampled stands not thinned (green) and unsampled and thinned stands (red). The value of this index for the field plots (light blue) provides the baseline value (i.e., the value observed for the sample of field plots). Figure C2. Comparison of density functions for the predictors in the models used to estimate changes in Basal Area using the -modeling method and -modeling method in field plots (light blue), sampled and not thinned stands (dark blue), unsampled and not thinned stands (light blue) and unsampled and thinned stands (red). For each group the area of overlap, AO, with the density function for the field plots (green) is provided for each predictor. Figure A4. Comparison of density functions for the predictors in the models used to estimate changes in Basal Area using the δ-modeling method and y-modeling method in field plots (light blue), sampled and not thinned stands (dark blue), unsampled and not thinned stands (light blue) and unsampled and thinned stands (red). For each group the area of overlap, AO, with the density function for the field plots (green) is provided for each predictor. Figure C3. Comparison of density functions for the predictors in the models used to estimate changes in Biomass using the -modeling method and -modeling method in field plots (light blue), sampled and not thinned stands (dark blue), unsampled and not thinned stands (light blue) and unsampled and thinned stands (red). For each group the area of overlap, AO, with the density function for the field plots (green) is provided for each predictor. Figure A5. Comparison of density functions for the predictors in the models used to estimate changes in Biomass using the δ-modeling method and y-modeling method in field plots (light blue), sampled and not thinned stands (dark blue), unsampled and not thinned stands (light blue) and unsampled and thinned stands (red). For each group the area of overlap, AO, with the density function for the field plots (green) is provided for each predictor.