A Comparison of Imputation Approaches for Estimating Forest Biomass Using Landsat Time-Series and Inventory Data

: The prediction of forest biomass at the landscape scale can be achieved by integrating data from ﬁeld plots with satellite imagery, in particular data from the Landsat archive, using k-nearest neighbour (kNN) imputation models. While studies have demonstrated different kNN imputation approaches for estimating forest biomass from remote sensing data and forest inventory plots, there is no general agreement on which approach is most appropriate for biomass estimation across large areas. In this study, we compared several imputation approaches for estimating forest biomass using Landsat time-series and inventory plot data. We evaluated 18 kNN models to impute three aboveground biomass (AGB) variables (total AGB, AGB of live trees and AGB of dead trees). These models were developed using different distance techniques (Random Forest or RF, Gradient Nearest Neighbour or GNN, and Most Similar Neighbour or MSN) and different combinations of response variables (model scenarios). Direct biomass imputation models were trained according to the biomass variables while indirect biomass imputation models were trained according to combinations of forest structure variables (e.g., basal area, stem density and stem volume of live and dead-standing trees). We also assessed the ability of our imputation method to spatially predict biomass variables across large areas in relation to a forest disturbance history over a 30-year period (1987–2016). Our results show that RF consistently outperformed MSN and GNN distance techniques across different model scenarios and biomass variables. The lowest error rates were achieved by RF-based models with generalized root mean squared difference (gRMSD, RMSE divided by the standard deviation of the observed values) ranging from 0.74 to 1.24. Whereas gRMSD associated with MSN-based and GNN-based models ranged from 0.92 to 1.36 and from 1.04 to 1.42, respectively. The indirect imputation method generally achieved better biomass predictions than the direct imputation method. In particular, the kNN model trained with the combination of basal area and stem density variables was the most robust for estimating forest biomass. This model reported a gRMSD of 0.89, 0.95 and 1.08 for total AGB, AGB of live trees and AGB of dead trees, respectively. In addition, spatial predictions of biomass showed relatively consistent trends with disturbance severity and time since disturbance across the time-series. As the kNN imputation method is increasingly being used by land managers and researchers to map forest biomass, this work helps those using these methods ensure their modelling and mapping practices are optimized.


Introduction
Forest biomass is considered a key factor in carbon and water cycles in terrestrial ecosystems. Spatial estimates of forest biomass are therefore needed for understanding the sources and sinks of terrestrial carbon and for accomplishing scientific and practical tasks in forest management [1]. National and international reporting of forest biomass and carbon has traditionally relied upon field measurements from National Forest Inventory (NFI) programs [2]. However, these approaches cannot provide a continuous spatial distribution of forest biomass at the landscape scale, since field measurements often have limited spatial and temporal coverage, especially for large jurisdictions or remote areas [3]. To address this problem, researchers and practitioners often combine field measurements with remote sensing data to estimate forest biomass across large areas. Many studies have recently demonstrated the ability of lidar (light detection and ranging) data in providing high accuracy estimates of forest biomass and structure [4][5][6][7][8][9][10]. Lidar can be integrated with forest inventory data to make lidar-based biomass maps where data is available wall-to-wall [7][8][9]11,12], or to create lidar-based plots to support forest inventory where data are discrete available as sample transects [13,14]. Lidar-based plots can then be fused with larger coverage data, such as satellite imagery, to facilitate mapping forest biomass and structure at the land management scale [15][16][17][18][19][20]. However, while lidar is often available over forests in developed regions such as North America, such data is generally not available in developing regions, due to its high acquisition costs and advanced computational requirements. The immediate need of biomass estimations across large forest areas, therefore, relies on field-based inventories and multi-spectral remote sensing data provided by satellites such as Landsat and Sentinel.
The free availability of the entire historic Landsat archive (since 2008) makes it a popular remotely sensed data source for mapping current forest biomass, as well as monitoring forest biomass dynamics across space and time. The Landsat archive provides a 40-year collection of satellite imagery (since 1972) at a spatial resolution sufficient for capturing changes in forests [21]. This has facilitated the development of many approaches that utilize Landsat time-series imagery to characterize forest change [22,23]. Information from Landsat time-series has been widely used in many studies to estimate forest biomass and other structure attributes across large areas (e.g., [15,16,19,20,[24][25][26][27]). At the conceptual level, Kennedy, Ohmann, Gregory, Roberts, Yang, Bell, Kane, Hughes, Cohen, Powell, Neeti, Larrue, Hooper, Kane, Miller, Perkins, Braaten and Seidl [24] developed a comprehensive forest biomass monitoring framework that is based on the analysis of Landsat time-series. Studies have also demonstrated the utility of Landsat time-series in improving forest biomass and structure estimates in comparison with methods that conventionally rely on single-date Landsat images [6,9,16,28]. For example, Pflugmacher, Cohen and E. Kennedy [9] indicated that the inclusion of spectral disturbance and recovery metrics extracted from Landsat time-series can improve biomass model results. More recently, Bolton, White, Wulder, Coops, Hermosilla and Yuan [16] demonstrated that time series metrics such as long-term Landsat spectral means and variability can also describe long-term forest dynamics. The inclusion of time series metrics not only improves the accuracy of empirical models but also makes spatial predictions of forest attributes more consistent with ecological changes such as those resulting from forest disturbance and recovery processes [19].
In forest mapping applications, k-Nearest Neighbour (kNN) imputation has been commonly used to leverage forest attributes of interest (response variables), derived from sample plot data, with spatial metrics (predictor variables), derived from remote sensing data, to generate spatial predictions of forest attributes (e.g., [15,16,19,24,29,30]). kNN imputation is a non-parametric and multivariate modelling method that aims to impute (or share) values of response variables from measured samples to target samples where response variables have not been observed (e.g., pixels covering an area of interest) [31]. Imputation associated with each target sample is based on the similarity (evaluated using a distance metric) between values of predictor variables associated with that target pixel and those associated with k nearest training samples. The imputed value of each target sample is assigned as the observed value of a particular training sample if k = 1, and as the average of observed values Remote Sens. 2018, 10, 1825 3 of 22 associated to k nearest training samples if k > 1. Using k values greater than 1 often produces more accurate imputation results than using k = 1, but increases the bias/variance between imputed and observed values, which is an important consideration when comparing the performance of imputation models [31,32]. There are several common techniques to derive a nearest neighbour distance metric (hereafter, distance technique), including weighted Euclidean distance based techniques, such as Most Similar Neighbour (MSN) [33], or Gradient Nearest Neighbour (GNN) [34], and machine learning based methods such as Random Forest (RF) [35]. In fact, MSN is the configuration consisting of the canonical correlation analysis and k = 1 [33], and GNN is the configuration consisting of the canonical correspondence analysis and k = 1 [34]. However, several studies also used these techniques with k > 1 [36]. Some studies have compared kNN distance techniques [31,32,37]. Hudak, Crookston, Evans, Hall and Falkowski [31] compared several kNN distance techniques for imputing plot-level response variables (basal area and tree density) using airborne lidar data in small case study areas. The study found that the RF technique performed best in comparison with GNN and MSN. Eskelson, Temesgen, Lemay, Barrett, Crookston and Hudak [32] also found that RF outperformed other techniques when estimating current forest attributes from inventory data. As the kNN imputation method has been increasingly used in mapping forest biomass, it is critical for land managers and researchers to identify which distance technique performs better at regional and national scales.
Another important consideration when developing a kNN model is the method used for imputing biomass variables. This can be divided into two groups: direct and indirect imputation methods. In the direct method, forest biomass variables are included in kNN models as response variables and are thus directly imputed based on their relationship with predictor variables [30,[37][38][39][40]. In the indirect method, however, kNN models are trained by response variables other than biomass [15,19,24]. These response variables are often forest structure attributes extracted from inventory plots, such as basal area, stem density and stem volume. Alternatively, variables may be measured from lidar-based plots such as mean vegetation height and percentage of returns [19]. Biomass variables are then attached as ancillary variables to the imputation predictions for target samples. That is, the nearest neighbours for each target sample are found based on the relationship between structure attributes and predictor variables and then are indirectly transferred to biomass variables. For instance, Kennedy, Ohmann, Gregory, Roberts, Yang, Bell, Kane, Hughes, Cohen, Powell, Neeti, Larrue, Hooper, Kane, Miller, Perkins, Braaten and Seidl [24], following Ohmann and Gregory [34], developed GNN imputation models to link a variety of plot-level measures including basal area by species and tree size classes with Landsat-based predictor variables. Using GNN models, each pixel was assigned tree measurements from an inventory plot, resulting in inventory-like maps, from which forest biomass was mapped. As the abovementioned studies were conducted in different locations and used different data sources, it is difficult to compare the imputation methods used for estimating forest biomass. Currently, there is no general agreement on whether direct or indirect imputation methods are better for estimating forest biomass from remote sensing data.
The main aim of this study is to determine the most accurate imputation approaches for mapping forest biomass at the landscape scale, using remote sensing and inventory data, and in doing so, fill current gaps in the literature. To achieve this, we compare the performance of several imputation approaches used for modelling aboveground forest biomass (AGB) over large areas using Landsat time-series and field plot measurements. In particular, we compare and contrast (1) direct and indirect biomass imputation methods and (2) three commonly used kNN distance techniques (RF, GNN and MSN), for predicting three forest biomass variables (total AGB, AGB of live-standing trees, AGB of dead-standing trees). In addition, we demonstrate and assess the utility of the kNN imputation method for spatially predicting biomass variables across large areas in relation to ecological changes (a 30-year history of forest disturbance, Nguyen, Jones, Soto-Berelov, Haywood and Hislop [23]).

Materials and Methods
The main steps implemented in this study are summarized in Figure 1. In summary, we derived a series of forest biomass (total AGB, AGB of live and dead-standing trees) and structure variables (e.g., basal area, stem density) from 633 forest inventory plots. As predictor variables, we calculated Landsat time-series based metrics across 19 Landsat tiles plus topographic and climatic variables. We extracted the predictor variables for each plot location and then developed, compared, and evaluated different kNN imputation approaches. Finally, we assessed the ability of the optimal kNN model to predict forest biomass in relation to disturbance history over a 30-year time-series . A more detailed description follows below.
Remote Sens. 2018, 10, x FOR PEER REVIEW 4 of 32 The main steps implemented in this study are summarized in Figure 1. In summary, we derived a series of forest biomass (total AGB, AGB of live and dead-standing trees) and structure variables (e.g., basal area, stem density) from 633 forest inventory plots. As predictor variables, we calculated Landsat time-series based metrics across 19 Landsat tiles plus topographic and climatic variables. We extracted the predictor variables for each plot location and then developed, compared, and evaluated different kNN imputation approaches. Finally, we assessed the ability of the optimal kNN model to predict forest biomass in relation to disturbance history over a 30-year time-series . A more detailed description follows below.

Study Area
The study area contains the whole public land forest estate (7.1 million hectares) of Victoria, southeast Australia ( Figure 2). Victorian public land forests extend across the state and include two main land tenures: state forests, and national parks and conservation reserves [41]. The area is stratified by eleven main bioregions (Interim Biogeographic Regionalization for Australia or IBRA, Figure 2), each of which has distinct ecological, geological and climatological features [41].
Victorian public forests include a wide range of ecosystems. North-western Victoria is mostly covered by mallee ecosystems, characterised by small woodland (up to 8 m tall) and multi-stemmed forests, on flat to undulating landscapes. In contrast, most ecosystems in the central part of the State are box-ironbark and red gum forests, which have dense to sparse canopies and reach up to 25 m in height. They are found on flat to undulating topography on rocky and auriferous soils. The most widespread and variable forest ecosystem in the study area consists of damp sclerophyll forests, which cover the central and eastern parts of Victoria. In this ecosystem, trees grow on loam, clay loam, and sandy loam soils and heights range from 40 m to 60 m. Distributed mostly in the eastern part of the state are wet and dry sclerophyll forests. The former are the tallest forest ecosystems in Victoria, with trees reaching 75 m or more in height while the later include relatively low and

Study Area
The study area contains the whole public land forest estate (7.1 million hectares) of Victoria, southeast Australia ( Figure 2). Victorian public land forests extend across the state and include two main land tenures: state forests, and national parks and conservation reserves [41]. The area is stratified by eleven main bioregions (Interim Biogeographic Regionalization for Australia or IBRA, Figure 2), each of which has distinct ecological, geological and climatological features [41].
Victorian public forests include a wide range of ecosystems. North-western Victoria is mostly covered by mallee ecosystems, characterised by small woodland (up to 8 m tall) and multi-stemmed forests, on flat to undulating landscapes. In contrast, most ecosystems in the central part of the State are box-ironbark and red gum forests, which have dense to sparse canopies and reach up to 25 m in height. They are found on flat to undulating topography on rocky and auriferous soils. The most widespread and variable forest ecosystem in the study area consists of damp sclerophyll forests, which cover the central and eastern parts of Victoria. In this ecosystem, trees grow on loam, clay loam, and sandy loam soils and heights range from 40 m to 60 m. Distributed mostly in the eastern part of the state are wet and dry sclerophyll forests. The former are the tallest forest ecosystems in Victoria, with trees reaching 75 m or more in height while the later include relatively low and spreading trees that reach a maximum height of 25 m [42]. Forests within the study area have been impacted by a series of disturbance events including fuel reduction burns, wildfires, logging and drought, resulting in significant changes in carbon biomass stocks [23].
Remote Sens. 2018, 10, x FOR PEER REVIEW 5 of 32 spreading trees that reach a maximum height of 25 m [42]. Forests within the study area have been impacted by a series of disturbance events including fuel reduction burns, wildfires, logging and drought, resulting in significant changes in carbon biomass stocks [23].

Forest Biomass and Structure Response Variables
Forest inventory plot data was collected as a part of the Victorian Forest Monitoring Program (VFMP) which is implemented by the Department of Environment, Land, Water, and Planning [43]. The VFMP includes a network of 786 permanent ground circular plots (0.04 ha) randomly distributed across a systematic statewide grid ( Figure 2). In each plot, inventory data is collected on large trees, small trees, herbs, shrubs, and woody debris. Further descriptions of sample designs and field measurements can be found in Haywood, Mellor and Stone [43].
In this study, we extracted data from 633 VFMP plots measured between 2011 and 2016 and calculated nine tree-level forest biomass and structure variables (Table 1). Following Haywood and Stone [44], we estimated AGB (Mg·ha −1 ) of all large live-standing trees (AGBlive_tree, ≥10 cm in diameter at breast height-DBH) using a generic allometric equation for sclerophyll forests [45] of the form:

Forest Biomass and Structure Response Variables
Forest inventory plot data was collected as a part of the Victorian Forest Monitoring Program (VFMP) which is implemented by the Department of Environment, Land, Water, and Planning [43]. The VFMP includes a network of 786 permanent ground circular plots (0.04 ha) randomly distributed across a systematic statewide grid ( Figure 2). In each plot, inventory data is collected on large trees, small trees, herbs, shrubs, and woody debris. Further descriptions of sample designs and field measurements can be found in Haywood, Mellor and Stone [43].
In this study, we extracted data from 633 VFMP plots measured between 2011 and 2016 and calculated nine tree-level forest biomass and structure variables (Table 1). Following Haywood and Stone [44], we estimated AGB (Mg·ha −1 ) of all large live-standing trees (AGB live_tree , ≥10 cm in Remote Sens. 2018, 10, 1825 6 of 22 diameter at breast height-DBH) using a generic allometric equation for sclerophyll forests [45] of the form: where BLT i is the AGB of large live-standing tree i. AGB of large dead-standing trees (AGB dead_tree ) was calculated by subtracting the amount of biomass in leaves, twigs and branches from AGB of live-standing trees. Total aboveground woody biomass (AGB total ) was calculated as the sum biomass of live-and dead-standing trees (AGB live_tree and AGB dead_tree ), small trees (<10 cm DBH), stumps, slash and coarse woody debris (all fallen dead woody material, Haywood and Stone [44]). In addition to biomass variables, we calculated some structural measures that are summaries of tree measurements in each plot, including basal area (BA), stem density (TD, number of trees per hectare), and tree volume (VL) of both live-and dead-standing trees ( Table 1).

Predictor Variables
The candidate predictor variables included Landsat-based metrics as well as topographic and climatic ancillary data ( Table 2). Landsat-based variables consisted of spectral indices and change metrics derived from the analysis of Landsat time-series.

Landsat Time-Series
The study area is covered by 19 Landsat WRS-2 tiles ranging from row 84 to 87, and path 90 to 96 ( Figure 2). From the USGS archive, we processed all available Level-1 Terrain Corrected (L1T) Landsat TM/ETM+ images acquired within a pre-defined southern hemisphere summer window (from December to the end of February) from 1987 to 2016. Annual anniversary-date, best observation mosaic composites were created for the 30-year time period using all cloud-free observations within the summer window. For more details on image processing, please see Nguyen, Jones, Soto-Berelov, Haywood and Hislop [23]. For each composite of surface reflectance, we calculated the Normalized Burn Ratio index (NBR) [46], and the Tasselled-cap (TC) components of Greenness (TCG), Brightness (TCB), and Wetness (TCW) [47]. We also computed TC angle (TCA = arctan (TCG/TCB), Powell, Cohen, Healey, Kennedy, Moisen, Pierce and Ohmann [38]) and TC distance (TCD = TCG 2 + TCB 2 , Duane, et al. [48]).
To derive spectral trends for each pixel, we analysed annual NBR time-series using the LandTrendr temporal segmentation algorithm developed by Kennedy, et al. [49]. Similar to other studies (e.g., [49][50][51][52]), we found that NBR was the most sensitive spectral index for capturing forest disturbance from Landsat time-series [53]. The core segmentation process includes two steps: finding vertices and fitting trends. The first step establishes the vertex years that define temporal breakpoints, reducing year to year noise over the time series. The second step determines the best straight-line trajectory that fits through those vertices, resulting in a fitted spectral trajectory for each pixel within the processing area ( Figure 3). arctan (TCG/TCB), Powell, Cohen, Healey, Kennedy, Moisen, Pierce and Ohmann [38]) and TC distance (TCD = √TCG + TCB , Duane, et al. [48]).
To derive spectral trends for each pixel, we analysed annual NBR time-series using the LandTrendr temporal segmentation algorithm developed by Kennedy, et al. [49]. Similar to other studies [e.g., 49,50,51,52]), we found that NBR was the most sensitive spectral index for capturing forest disturbance from Landsat time-series [53]. The core segmentation process includes two steps: finding vertices and fitting trends. The first step establishes the vertex years that define temporal breakpoints, reducing year to year noise over the time series. The second step determines the best straight-line trajectory that fits through those vertices, resulting in a fitted spectral trajectory for each pixel within the processing area ( Figure 3).  From the NBR fitted trajectories, we computed a suite of change metrics representing disturbance and recovery trends for each pixel ( Table 2). We first labelled the greatest disturbance segment which corresponds with the greatest negative change magnitude. We then identified the subsequent recovery segment as the monotonic positive segment following the greatest disturbance ( Figure 3). More details can be found in Nguyen, Jones, Soto-Berelov, Haywood and Hislop [23]. These two segments were then used to derive disturbance and recovery metrics that represent year, duration, and spectral magnitude of change, as well as pre-and post-change spectral conditions [19,20,23,50]. We also calculated the number of years since the greatest disturbance (or time since disturbance, TSD) to the last year of the time-series (2016). However, for this metric, we applied an additional rule that set the TSD of all un-changed pixels to 50 years. TSD is an important predictor of the regrowth of forests following disturbance. However, since we only had disturbance data for the last 30 years, we assigned non-disturbed pixels a value of 50, to represent mature forests. This was done so as not to confuse the model with a value of 0, for example, which might be interpreted incorrectly. Although 50 was a somewhat arbitrary value, in these forests, it was considered sufficient to represent mature forests. Using disturbance and recovery metrics, we developed classification models to predict disturbance severity levels (high, medium and low; with a high level disturbance resulting in the full removal of trees in forests) and associated causal agents [23]. These data were also included as predictor variables in biomass models in this study, resulting in 15 change metrics in total. A spatial filtering process was applied to the derived change metrics to select only pixels within forested areas and remove change events smaller than 0.5 ha [23].

Topographic and Climatic Ancillary Data
Topographic variables including elevation and slope were derived from the Shuttle Radar Topography Mission (SRTM) 1 arc-second resolution (~30 m) dataset [54]. Mean annual temperature and mean total rainfall were extracted from the Global Climate Data (WorldClim) with a spatial resolution of 1 km [55]. These datasets were obtained and resampled to a spatial resolution of 30 m, to match that of Landsat.

Variable Extraction
For each inventory plot location, we extracted the values from the prepared predictor variables (Table 2)  values from a kernel size, such as a 3 × 3 pixel window, minimises the spatial mismatch between spatial data and inventory plots, the use of the single Landsat pixel allows us to capture the specific disturbance history associated with each inventory plot. Earlier research has shown that a disturbance may impact pixels within a plot in different ways [23]. For example, a fire may have a higher severity in a given pixel than its adjacent pixels. To reduce the mismatch between plot condition at the time of field measurement and time of image acquisition, the Landsat-index variables were extracted from the composite images with the dates that closely coincided with plot measurement dates. Values of change metrics for each disturbed plot were calculated for the time period from 1987 to the plot observation year. In addition, we identified and removed potential outliers to improve data quality. Outliers were defined as plots associated with edge effects such as adjacent water or roads. This removed approximately 8% of the of inventory plots.

Variable Selection
As reducing the number of predictor variables can avoid detrimental impacts on kNN imputation accuracy, the preliminary modelling step was to determine an optimal set of predictor variables. To achieve this, we first ran the least absolute shrinkage and selection operator (LASSO, Efron, et al. [56]) model to rank all predictor variables based on their importance to each response variable. The LASSO model quantifies the strength of the relationship between predictor variables and response variables using a pseudo R 2 metric that ranges from 0 to 1. The importance rankings of predictor variables for individual response variables were made based on this metric. Predictor variables with consistently low rankings were excluded from further analyses. Redundant variables (or highly correlated variables) were also identified and removed by calculating Pearson correlation coefficients between all pairs of remaining variables and removing those with r > 0.9.

Imputation Models
We developed and compared different kNN (with k = 1) imputation approaches to predict three biomass measurements (AGB total , AGB live_tree , AGB dead_tree ) based on several combinations of response variables and distance techniques. For each of the three distance techniques (RF, GNN, or MSN), we developed six model scenarios using different groups of response variables (Table 3), resulting in 18 kNN models in total. The first model scenario (BM) was the direct biomass imputation model since it was trained by biomass response variables (AGB total , AGB live_tree and AGB dead_tree ). The nearest neighbour was found by directly relating observed biomass variables to predictor variables. In contrast, the other five model scenarios (BA, TD, VL, BA-TD and VL-TD) were the indirect biomass imputation models, as the nearest neighbour was found based on the relationships between predictor variables and forest structure variables rather than biomass variables. Biomass measurements of the corresponding training plots were not included in these models but were subsequently attached as ancillary variables to impute each target pixel. The combination of BA and VL was not included in our analysis since these variables are highly correlated. The Pearson correlation coefficient between BA live_tree and VL live_tree was 0.91 and between BA dead_tree and VL dead_tree was 0.96.
GNN and MSN identify the nearest neighbour based on weighted Euclidean distance techniques. The MSN technique computes the distance in projected canonical space while the GNN technique computes distance using a projected ordination of predictors based on canonical correspondence analysis (CCA) [33,34]. The distance metric in RF, on the other hand, is derived based on a proximity matrix [35]. The elements of the proximity matrix contain the proportion of decision trees where both training and target samples are found in the same terminal node. The statistical distance metric is calculated as one minus that proportion [57]. After testing with different model parameters, we set the number of trees (ntree) to 200 for each RF model (associated with each response variable), and the number of predictor variables selected at each node (mtry) to the default, based on the square root of the number of predictors. These values were chosen since they minimized the model errors (RMSE) within the training dataset. Since kNN imputation creates multiple RF models associated with the number of response variables and shares the number of trees across the individual models, the input parameter of ntree of kNN model was actually 200 multiplied by the number of response variables. We built our models using the R-package yaImpute [57]. Table 3. Model scenarios developed for each distance technique (BM = biomass, BA = basal area, TD = stem density, VL = tree volume, X = denotes response variable group in each model).

Model Evaluation
We evaluated the accuracy of each model scenario using a leave-one-out cross validation approach. For each sample plot, models were trained by all data except the candidate plot which was then treated as a target observation. Errors were computed for each withheld sample and averaged to evaluate model performance. For each biomass variable, imputed values were compared to observed values using the generalised root mean square difference (gRMSD, RMSE divided by the standard deviation of the observed values under the assumption that they are representative of the population, Crookston and Finley [57]), and relative mean deviation (rMD, Gorard [58]) which is a measure of bias: where x i and x i are the imputed (predicted) and observed biomass value, respectively, of the ith sample, and x is the mean of observed values.

Assessment of Biomass Imputation Maps Using Disturbance History
After evaluating the accuracy metrics, we selected the kNN model that consistently performed well across the biomass response variables, to implement spatial imputations (k = 1) to forested pixels across the study area for the year 2016. We assessed the ability of the selected kNN model to predict forest biomass in relation to disturbance history throughout the 30-year time-series (1987-2016). To facilitate this, we created a reference dataset containing 7860 reference points (with a minimum distance of 250 m between them) across the study area [59]. These points were built around the VFMP inventory plot network (10 points around each plot), thus they were also stratified according to bioregions (Figure 2). Points that fell on the boundary of land cover types, or at the edge of disturbances were shifted away from the edge to avoid mis-registration errors. Reference points were then interpreted and attributed with disturbance severity levels (high, medium and low), and associated causal agents (fire, logging and other) using a multiple-lines of evidence approach. For a detailed explanation of the multiple lines of evidence approach, see Soto-Berelov, Haywood, Jones, Hislop and Nguyen [59]. Some points that fell in non-forest areas, defined following Nguyen, et al. [60], were removed from the dataset.
From the kNN imputed maps, we extracted the mean values from a 3 × 3 pixel window around each reference pixel, for each of the biomass response variables. As the range of biomass often varies across forest ecosystems, it is not appropriate to compare reference points from different bioregions using imputed biomass values. Thus, we grouped the random points by bioregion and then scaled biomass values for each set from 0 to 100 (corresponding with low to high biomass in each bioregion).
To examine the relationship between scaled biomass values and recent disturbance events, we selected reference points that had been disturbed by fire and logging events that occurred between 2013 and 2016. The main reason for choosing this time period is that, for non-stand replacing disturbance events occurring in over three years prior to the mapping date (2016), the evidence of their impacts on biomass maps may not be clear as forests may have regrown [61]. Trends and variation of biomass associated with these points were then analysed according to disturbance severity level and causal agent using boxplots. In addition, we assessed biomass patterns according to time since disturbance or TSD, for each disturbance severity level. This analysis was completed for all disturbances occurring within the time-series (1987-2016). We first stratified TSD into nine intervals (0-3, 3-6, 6-9, 9-12, 12-15, 15-18, and 18-26 years). The basis of this division was to illuminate the trend of biomass according to TSD. A shorter interval (3 years) was applied for TSD smaller than 18 years since forests often significantly change/grow during this period. We then grouped the disturbance points using these TSD intervals and by disturbance severity levels, and calculated the mean of the scaled biomass values for each group.

Variable Selection
The LASSO model reported that spectral indices (excepting TCB and TCD), climatic and topographic variables were the most important predictor variables related to our response variables ( Figure 4). Pre-and post-disturbance values and relative disturbance magnitude were the most important change metrics. Variables such as disturbance and recovery onset year and duration had consistently low importance rankings and thus were excluded from kNN biomass models. Although disturbance severity levels and causal agents had relatively low importance, these variables were included in biomass models since they have been effective in other studies [15,16]. TCB and TCD were relatively important but were excluded as redundant variables, based on having high correlations. Nineteen remaining variables were finally selected to use in the kNN biomass models. The LASSO model reported that spectral indices (excepting TCB and TCD), climatic and topographic variables were the most important predictor variables related to our response variables ( Figure 4). Pre-and post-disturbance values and relative disturbance magnitude were the most important change metrics. Variables such as disturbance and recovery onset year and duration had consistently low importance rankings and thus were excluded from kNN biomass models. Although disturbance severity levels and causal agents had relatively low importance, these variables were included in biomass models since they have been effective in other studies [15,16]. TCB and TCD were relatively important but were excluded as redundant variables, based on having high correlations. Nineteen remaining variables were finally selected to use in the kNN biomass models.

Biomass Imputation Model Accuracy
The results of biomass model assessment varied across model scenarios, imputation techniques (RF, GNN and MSN) and the biomass variables (AGBtotal, AGBlive_tree and AGBdead_tree). Figure 5 shows the pattern of gRMSD across all tested models. It clearly indicates that RF consistently outperformed

Biomass Imputation Model Accuracy
The results of biomass model assessment varied across model scenarios, imputation techniques (RF, GNN and MSN) and the biomass variables (AGB total , AGB live_tree and AGB dead_tree ). Figure 5 shows the pattern of gRMSD across all tested models. It clearly indicates that RF consistently outperformed MSN and GNN distance techniques in terms of gRMSD. Across biomass models, AGB total consistently achieved lower gRMSD values (ranging from 0.74 to 1.34) than AGB live_tree (0.88-1.39) and AGB dead_tree (1.08-1.42). The lowest gRMSD values across biomass variables were reported by RF-based BM, BA and VL models, while the highest values were reported by GNN-based BA and TD models. In addition, model scenarios performed differently across imputation techniques. For example, the BM model scenario achieved the lowest errors when using RF and MSN but not when using GNN. The BA model scenario showed better performance than the other model scenarios when using RF but it was one of the worst performings when using GNN. The VL-TD model scenario reported lower error rates than the other model scenarios when using MSN or GNN techniques but higher when using RF. TD was the worst performing model scenario regardless of distance techniques. The patterns in model-strength indicated by rMD were relatively similar to those illustrated by gRMSD ( Figure 6). RF performed much better than the other distance techniques and rMD values were generally lower for AGBtotal and AGBlive_tree (ranging from 0.26 to 6.44 for AGBtotal, from 0.03 to 6.46 for AGBlive_tree, and from 0.55 to 6.45 for AGBdead_tree). The lowest values associated with AGBtotal and AGBlive_tree were achieved by the RF-based BA model while the lowest value associated with AGBdead_tree was obtained by the RF-based TD model. The highest values were reported by GNN-based BA and TD models. Similar to gRMSD, the patterns of rMD indicate that model scenarios show varied performances across distance techniques and biomass variables. When using RF, the BA model scenario reported lower error rates for AGBtotal and AGBlive_tree (0.26 and 0.03, respectively) than the other model scenarios. When using MSN and GNN distance techniques, however, lower error rates were reported by TD and VL-TD model scenarios, respectively. In addition, models associated with low bias for AGBtotal and AGBlive_tree often reported relatively high bias for AGBdead_tree, for example 4.84 for the RF-based BA model and 4.42 for the MSN-based TD model. The patterns in model-strength indicated by rMD were relatively similar to those illustrated by gRMSD ( Figure 6). RF performed much better than the other distance techniques and rMD values were generally lower for AGB total and AGB live_tree (ranging from 0.26 to 6.44 for AGB total , from 0.03 to 6.46 for AGB live_tree , and from 0.55 to 6.45 for AGB dead_tree ). The lowest values associated with AGB total and AGB live_tree were achieved by the RF-based BA model while the lowest value associated with AGB dead_tree was obtained by the RF-based TD model. The highest values were reported by GNN-based BA and TD models. Similar to gRMSD, the patterns of rMD indicate that model scenarios show varied performances across distance techniques and biomass variables. When using RF, the BA model scenario reported lower error rates for AGB total and AGB live_tree (0.26 and 0.03, respectively) than the other model scenarios. When using MSN and GNN distance techniques, however, lower error rates were reported by TD and VL-TD model scenarios, respectively. In addition, models associated with low bias for AGB total and AGB live_tree often reported relatively high bias for AGB dead_tree , for example 4.84 for the RF-based BA model and 4.42 for the MSN-based TD model.

Biomass Imputation Maps in Relation to Disturbance History
Based on model comparisons, we selected the model trained by the combination of basal area and stem density variables (RF-based BA-TD model) to produce imputation maps of biomass for 2016, across the study area (Figure 7). This model reported relatively and consistently low error rates across both accuracy metrics (gRMSD and rMD) and all three biomass variables ( Figure 8). For all biomass variables, the model over-and under-predicted low and high observed biomass values, respectively. The imputation maps of forest biomass show clear longitudinal trends at the state level, with lower biomass in northwestern mallee forests and higher in southeast sclerophyll forests (  Figure 7a). At the local scale, biomass predictions are spatially consistent with ground conditions of forests, accurately capturing evidence of recent forest disturbance (  Figure 7b-e).
Predictions of biomass showed relatively consistent trends in relation to the severity of recent (2013-2016) fire and logging disturbance events ( Figure 9). For all three biomass variables, high severity disturbance consistently resulted in low biomass predictions. Scaled biomass values were generally higher when relating to lower disturbance levels (medium and low severity), excepting small trends in AGBtotal and AGBlive_tree associated with low severity logging. In addition, fire disturbed areas indicated slightly lower scaled biomass values in comparison with logged areas.
The trends of biomass predictions in relation to TSD varied across disturbance severity levels and biomass variables ( Figure 10). Similar to the analysis on recent disturbances, the analysis on all disturbances within the 30-year time-series indicated that biomass predictions associated with low severity disturbance were generally higher than those associated with medium and high severity disturbance. For high and medium severity levels, scaled biomass values were generally higher with increased TSD. For the low severity level, on the other hand, scaled biomass values were generally stable across TSD intervals. However, these trends were not linear. There was a substantial increase in biomass predictions within the 3-6 and 6-9 years TSD intervals across all disturbance severity levels and biomass variables.

Biomass Imputation Maps in Relation to Disturbance History
Based on model comparisons, we selected the model trained by the combination of basal area and stem density variables (RF-based BA-TD model) to produce imputation maps of biomass for 2016, across the study area (Figure 7). This model reported relatively and consistently low error rates across both accuracy metrics (gRMSD and rMD) and all three biomass variables ( Figure 8). For all biomass variables, the model over-and under-predicted low and high observed biomass values, respectively. The imputation maps of forest biomass show clear longitudinal trends at the state level, with lower biomass in northwestern mallee forests and higher in southeast sclerophyll forests (Figure 7a). At the local scale, biomass predictions are spatially consistent with ground conditions of forests, accurately capturing evidence of recent forest disturbance (Figure 7b-e).
Predictions of biomass showed relatively consistent trends in relation to the severity of recent (2013-2016) fire and logging disturbance events (Figure 9). For all three biomass variables, high severity disturbance consistently resulted in low biomass predictions. Scaled biomass values were generally higher when relating to lower disturbance levels (medium and low severity), excepting small trends in AGB total and AGB live_tree associated with low severity logging. In addition, fire disturbed areas indicated slightly lower scaled biomass values in comparison with logged areas.
The trends of biomass predictions in relation to TSD varied across disturbance severity levels and biomass variables ( Figure 10). Similar to the analysis on recent disturbances, the analysis on all disturbances within the 30-year time-series indicated that biomass predictions associated with low severity disturbance were generally higher than those associated with medium and high severity disturbance. For high and medium severity levels, scaled biomass values were generally higher with increased TSD. For the low severity level, on the other hand, scaled biomass values were generally stable across TSD intervals. However, these trends were not linear. There was a substantial increase in biomass predictions within the 3-6 and 6-9 years TSD intervals across all disturbance severity levels and biomass variables. Remote Sens. 2018, 10, x FOR PEER REVIEW 14 of 23

Discussion
Studies have demonstrated different kNN imputation approaches for the empirical estimation of forest AGB using remotely sensed time-series and forest inventory data. However, different approaches can produce markedly different results. Thus, knowledge of which approach is the most appropriate is needed for land managers and researchers. In this study, we compared several kNN-based imputation approaches for estimating forest biomass using Landsat time-series and field plot measurements. Specifically, we evaluated three commonly used kNN distance techniques: RF, MSN and GNN, and two biomass imputation methods: direct and indirect imputation. While a few studies have compared the performance of different distance techniques used in kNN imputation models [31,32], none of them conducted their comparisons by leveraging multispectral time-series data and forest inventory data. Furthermore, there is no existing literature demonstrating whether biomass variables should be included as response variables and directly imputed, or be indirectly imputed from models built upon other structure variables. In addition, we evaluated the utility of kNN imputation models built upon Landsat time-series and inventory data by conducting ecological validations of imputed biomass maps.
Our results indicate that the accuracy of kNN biomass imputation models (k = 1) varies with different distance metrics. Models based on the RF distance technique, which calculates the distance metric based on a proximity matrix, generally outperformed those based on MSN and GNN distance techniques. As shown in Figures 5 and 6, RF-based models consistently achieved lower error rates of biomass imputation (in both gRMSD and rMD values) as compared to MSN and GNN-based models. These results agree with previous studies that compared imputation techniques for modelling forest attributes [31,32]. In addition, we found the performance of RF-based models to be relatively stable while that of MSN-and GNN-based models tended to be inconsistent across varied model scenarios. The rMD value associated with AGB total reported by RF-based models ranged from 0.26 to 0.54, while the range reported by MSN and GNN-based models were significantly higher (from 0.45 to 2.59 and from 1.87 to 6.44, respectively, Figure 6).
The results also indicate the influence of the number of response variables included in kNN biomass imputation models on the performance of different distance techniques. In general, RF-based models were not significantly impacted by the number of response variables, given the models with four response variables (RF-based BA-TD and VL-TD models) achieved comparable accuracies to those with two variables (such as RF-based BA and VL models). This could be due to the small number of response variables (2 to 4) included in our RF-based models. Previous studies used a larger number of variables and found that RF works optimally with few variables and when factors are used rather than continuous values [31,57]. In contrast to RF, GNN-based models performed significantly better with an increased number of response variables. Error rates reported by GNN-based models were highest for models with two response variables (such as GNN-based BA and TD models) and lowest for those with four response variables (GNN-based BA-TD and VL-TD models). This compares favourably with other studies [34,62]. The performance of MSN-based models varied with the number of response variables, making it difficult to identify a specific trend. As our models included a relatively small number of response variables, further work is needed to investigate the impact of the number of response variables on the performance of kNN imputation with different distance techniques.
Among indirect biomass imputation model scenarios (BA, VL, TD, BA-TD, and VL-TD), models trained by basal area or stem volume variables generally achieved better accuracy than those trained by stem density variables (Figures 5 and 6). This was expected, since basal area and stem volume variables are often more correlated with the biomass variables than stem density variables. However, models with only basal area or stem volume response variables (BA and VL scenarios) often produced unbalanced accuracies across biomass ranges, exhibiting high bias for the dead biomass variable (AGB dead_tree , Figure 6). The inclusion of stem density variables (BA-TD and VL-TD scenarios) significantly reduced this bias, balancing accuracy across all biomass variables. rMD values associated with AGB dead_tree reported by BA and VL models ranged from 2.12 to 6.35 while those reported by BA-TD and VL-TD models ranged from 0.71 to 3.27 (with an exception of 5.54 from GNN-based VL-TD model).
The determination of whether direct or indirect imputation method is better for forest biomass estimation depends on the distance technique used in the kNN model. Although the direct biomass imputation model (BM model scenario) performs relatively well across different distance techniques, it is not always the best method for estimating biomass variables. When the MSN distance technique is applied, the BM model scenario performed better overall than the indirect model scenarios given it consistently reports lower errors for all three biomass variables. The next best is the BA-TD model scenario which produced slightly higher errors for total and live tree AGB but lower errors for dead tree AGB (Figures 5 and 6). When using the RF distance technique, however, the BM model scenario did not perform as well as the BA and BA-TD scenarios. Although the BM model scenario achieved the lowest gRMSD values (0.74 and 0.88 for AGB total and AGB live_tree , respectively), it produced greater bias than the BA and BA-TD model scenarios. In particular, the rMD values reported by the BM model scenario ranged from 0.48 to 6.33 while those reported by BA and BA-TD were from 0.03 to 4.84 and from 0.33 to 3.27, respectively ( Figure 6). Despite a relatively high rMD value for AGB dead_tree (4.84), it is reasonable to consider BA as the best model scenario for imputing AGB variables when applying the RF distance technique. The results from GNN-based models indicate the superior performance of VL-TD and BA-TD model scenarios. The model trained by stem volume and stem density variables (VL-TD) obtained the lowest errors for total and live tree AGB but high errors for dead tree AGB. Whereas, the model trained by basal area and stem density variables (BA-TD) achieved more consistent results across all three biomass variables (Figures 5 and 6). These results suggest that BA-TD is the most robust model scenario for imputing forest biomass given it maintains the most consistent performance across the tested distance techniques and biomass variables. Overall, the indirect imputation method, particularly kNN models trained using a combination of basal area and stem density variables, achieved better biomass estimates than the direct imputation method.
It is important to note that we developed and compared the kNN models using the single nearest neighbour (k = 1). This makes the comparison consistent and our methods and results applicable in other study areas/contexts. While increasing k (to an optimal value) reduces the imputation error, the determination of an optimal k value is often difficult and depends on many factors including distance metrics, response variables and forest environments [32,36]. Maintaining consistent parameters (i.e., k value) and methods allows the imputation results to be evaluated more effectively [32]. We note also, that the use of a single nearest neighbour has been increasing in forest applications with kNN models, particularly in biomass estimation [15,16,19,29,36,63]. Chirici, Mura, McInerney, Py, Tomppo, Waser, Travaglini and McRoberts [36] found in their review work that k = 1 is the most common selection to use with MSN and GNN techniques. This is reasonable since these techniques are initially created to use with the single nearest neighbour, as mentioned in the introduction. Recent studies using the RF technique also often selected k = 1 for their imputation models [16,19,63]. As forest inventory programs are increasingly developed systematically, measurements from inventory plots are representatives of forest populations [64]. The use of k = 1 is thus recommended to keep variance in the imputations similar to variance in the observations. Although further work is required to examine how higher k values impact the comparison results, we strongly believe that RF would outperform MSN and GNN distance techniques, regardless of the k values used.
Our results indicate that RF was the most accurate distance technique, thus the selection of the best kNN imputation model for mapping biomass variables was between the RF-based BA and BA-TD models. The former was most accurate for AGB total and AGB live_tree estimates, while the latter resulted in a more balanced performance across all three biomass variables. Our biomass maps were predicted using the BA-TD model as we aimed to focus on both live and dead biomass pools ( Figure 8). Results from the model showed that AGB total and AGB live_tree achieved better accuracies than AGB dead_tree . This supports the results of other studies, which have demonstrated that the total and live biomass variables often have better relationships with Landsat spectral values than dead biomass [6,9,65]. Spatial predictions of biomass were consistent with the distribution of different forest systems across the state (as described in Section 2.1), with low productivity in the low-spare mallee forests in the northwest and high productivity in the high-dense sclerophyll forests in the southeast (Figure 7).
Predictions of biomass were relatively consistent with the 30-year disturbance history of forests within the study area (Figures 9 and 10). Forest dynamics within the study area are dominated by fire and logging disturbances. When relating current predictions of forest biomass with recent disturbance events, we found that increased disturbance severity was consistently associated with decreased biomass predictions (Figure 9). The reduction of dead tree AGB after a high severe fire is expected given we defined a high level disturbance as the full removal of trees in forests [23]. This trend was also evident in the analysis based on all disturbances occurring from 1988 to 2016 ( Figure 10) and is consistent with findings from other studies [19,26]. In addition, the results also suggest that predicted biomass was more sensitive to fire than to logging disturbance ( Figure 9). This should be expected as un-wanted parts of logged trees (such as branches and stumps) and small trees often remained in forests after a selective logging event. Although current predictions of biomass are more variable when relating to time since disturbance (TSD), the trends were relatively consistent with post-disturbance forest recovery. Biomass values were often lowest within 0 to 3 years following a disturbance and generally reached an asymptotic level at 18-26 years after disturbance ( Figure 10). Within Victorian forests, fires often cause high rates of tree mortality, which can exist for many years after a fire, resulting in increased trends in dead tree biomass [44]. In general, biomass predictions showed relatively high values within 3-9 years after a disturbance. This trend was consistent across the biomass variables and disturbance severity levels ( Figure 10). The reason for this could be that Landsat spectra and indices are known to saturate at relatively low leaf area and biomass levels. These can be attained only a few years post-disturbance [61]. This also suggests that the uncertainty of predicted biomass maps can be informed by forest disturbance history.
Our analysis on variable selection further clarifies the benefits of including change metrics from Landsat time-series to improve predictions of forest biomass. To our knowledge, this is the first time spatial change metrics extracted from Landsat data have been combined with the systematic network of forest inventory, across large areas of sclerophyll forests in Victoria, Australia. Our results from the variable importance analysis were consistent with those from previous studies [6,9,16,19,20,63]. Spectral indices such as NBR and TCA were the most important variables overall, and change metrics such as disturbance and recovery magnitude, and TSD, were particularly important for modelling dead biomass and structure variables. Similar to Zald, Wulder, White, Hilker, Hermosilla, Hobart and Coops [19], our results indicate that change attribution variables (disturbance level and causal agent) are less important since fire is the dominant disturbance within the study area. However, these metrics may greatly benefit kNN imputation models by distinguishing pixels with similar spectral information [19,63].

Conclusions
The kNN imputation method is increasingly used to combine remote sensing data with ground sample plots to produce spatially explicit predictions of forest biomass at the landscape scale. While studies have demonstrated different kNN imputation approaches for estimating forest biomass, there is currently no consensus on which method is most appropriate when integrating multispectral time-series with field inventory data. This study addresses this gap by comparing different kNN distance techniques (with k = 1) and biomass imputation methods (direct and indirect). We found that the best results of forest biomass predictions can be achieved using the indirect imputation method rather than the direct method. In addition, our results confirm that RF outperforms GNN and MSN distance techniques in biomass imputation models. Our recommendation is that land managers and researchers should consider using a RF-based kNN imputation model that incorporates Landsat-based time-series metrics with forest structure variables (basal area and stem density) for estimating forest biomass.