Mapping Forest Canopy Fuels in the Western United States with LiDAR–Landsat Covariance

: Comprehensive spatial coverage of forest canopy fuels is relied upon by ﬁre management in the US to predict ﬁre behavior, assess risk, and plan forest treatments. Here, a collection of light detection and ranging (LiDAR) datasets from the western US are fused with Landsat-derived spectral indices to map the canopy fuel attributes needed for wildﬁre predictions: canopy cover (CC), canopy height (CH), canopy base height (CBH), and canopy bulk density (CBD). A single, gradient boosting machine (GBM) model using data from all landscapes is able to characterize these relationships with only small reductions in model performance (mean 0.04 reduction in R 2 ) compared to local GBM models trained on individual landscapes. Model evaluations on independent LiDAR datasets show the single global model outperforming local models (mean 0.24 increase in R 2 ), indicating improved model generality. The global GBM model signiﬁcantly improves performance over existing LANDFIRE canopy fuels data products (R 2 ranging from 0.15 to 0.61 vs. − 3.94 to − 0.374). The ability to automatically update canopy fuels following wildﬁre disturbance is also evaluated, and results show intuitive reductions in canopy fuels for high and moderate ﬁre severity classes and little to no change for unburned to low ﬁre severity classes. Improved canopy fuel mapping and the ability to apply the same predictive model on an annual basis enhances forest, fuel, and ﬁre management.


Introduction
Characterization of forest structure remains a priority for a variety of scientific research and land management objectives [1][2][3][4][5][6]. Forest management across the Earth now relies on spatial data derived from remote sensing to inform policy and decision making [7]. For example, the LANDFIRE project produces nationally consistent and comprehensive forest canopy fuel maps for the US from Landsat satellite imagery and facilitates landscape-scale management including fuel and restoration treatment planning and assessment [8][9][10][11], prescribed fire planning and implementation [12], and wildfire prediction, suppression, impact mitigation, rehabilitation, and assessment [10,[13][14][15].
For wildfire management in particular, LANDFIRE provides the spatial data for fire models that predict the spread and intensity of wildfires [16]. These predictions are essential for wildfire risk assessments, which are increasing in necessity because strategic and tactical fire management decisions are now called to be explicitly risk based [14,17]. Wildfires are increasing in size and severity with unprecedented destruction and associated costs [18][19][20]. Climate change is set to increase aridity across the western US with expected increases to fire disturbance [21,22]. Accurate, up-to-date, and comprehensive datasets are thus needed for effective fire management and could prevent loss of life and property and promote ecosystem health.
To create these broad spatial data, standard approaches utilize field measurements, which provide explicit measurements of attributes [23]. These are then related to satellite imagery thereby creating spatially-complete datasets [24]. Such assessments have risen in scale from local [25] to regional [26] and global assessments [27] in parallel with the rise in image availability, computing performance, and analytical sophistication. However, traditional field-based approaches are often labor intensive, do not capture the range of variation on the landscape, have inconsistent data collection methods, and cannot feasibly capture many sought-after three-dimensional (3D) attributes [28,29].
Light detection and ranging (LiDAR), a now pervasive active remote sensing technology, combined with a scanning system (airborne laser scanner (ALS)), has provided large-area 3D datasets and facilitated the conceptualization of new forest attributes and statistical relationships to established forest metrics [30,31]. Field data have been crucial to the development of these attributes, but the consistency and accuracy of LiDAR allows the application of statistical relationships for forest attribute prediction over a diversity of geographic areas and reasonably similar forest types [32,33]. However, the spatial and temporal scales of these LiDAR data and existing predictive models are still limited for a variety of management needs. LiDAR data are often incomplete, out-of-date, or not available, and predictive models derived from these data are often overfit to specific landscapes and forest types [33].
As satellite imagery provides the necessary spatial and temporal coverage and LiDAR provides accurate 3D characterization, many studies have harnessed their complementary strengths and fused them to create comprehensive spatial datasets [34]. LiDAR metrics have been used directly as the response variable (e.g., canopy cover and height) or indirectly as the predictor variable of a modeled response (e.g., biomass, basal area, and Lorey's height). Correlations of satellite imagery to basic attributes characterizing vegetation coverage fractions, such as forest canopy cover, are well documented, especially from the series of Landsat satellites [35], but the ability of satellite imagery to characterize complex forest attributes, such as canopy height and height variability [36][37][38][39][40][41][42], basal area [43], biomass [34,[44][45][46], and other measures of structural complexity [47][48][49] has come relatively recently and are aided by LiDAR data. Though the potential for these types of characterizations using imagery has been exploited for several decades [50], the ubiquity of LiDAR datasets with large numbers of samples and accurate 3D characterizations have enabled more robust assessments over broad areas [49].
Canopy cover (CC) and canopy height (CH) can be directly measured by LiDAR while canopy bulk density (CBD) can be estimated using regression equations utilizing LiDAR-derived CH and CC as predictor variables [51,52]. Andersen et al. [53] developed multi-variate models to estimate canopy fuel parameters specific to the Pacific Northwest region of the US and Peterson et al. [52] adopted generalized canopy base height (CBH) and CBD conceptualizations. For CBH, many formulations exist including using LiDAR metrics as direct surrogates. For example, 1st percentile LiDAR heights [54], 25th percentile LiDAR height [55], and one standard deviation (SD) of LiDAR heights subtracted from mean LiDAR height [52,56]. These LiDAR-based estimates of CBH and CBD capture the spatial variation in 3D canopy structure important for fire modeling. While different in their conception from field-based estimates, the absolute values of these variables are not as important as representing the structural variability across the landscape because fire behavior modelers often need to calibrate canopy fuel values to match observed fire behavior to fire model outputs [9,57].
In conjunction with LiDAR-based training data, the rise of machine learning techniques has also contributed to the increase in correlative and predictive power for remote sensing [58]. Mousivand et al. [59] showed that the sum of second-order interactions was greater than first-order canopy effects on spectral reflectance at the scales and wavelengths of Landsat imagery. Machine learning algorithms have the ability to characterize these complex relationships given appropriate training data. Gradient boosting machines (GBMs) are one of these algorithms now applied in remote sensing and ecological Remote Sens. 2020, 12, 1000 3 of 37 sciences [60]. They combine the advantages of tree-based algorithms with a boosting approach that adaptively combines many simple regression trees [61,62].
Predictive models capable of reasonable accuracy across a diversity of landscapes have been a research priority for remote sensing and forestry applications [32,33,46]. The inability of models derived from localized datasets to apply to other landscapes stems from three primary and related issues: model overfitting [33,63], mis-specified predictor variables not fully characterizing the response, and spatial and temporal variance in the feature-response relationships-in this case, the LiDAR-Landsat relationships [49]. These issues are compounded when integrating collections of ALS datasets acquired from different time periods and locations.
This study aims to address these issues while developing methods to produce comprehensive and accurate canopy fuels maps for the western US. Model overfitting is a common issue for algorithms such as GBMs [60,64]. Several strategies have been developed to increase model generality. These regularization techniques include hyperparameter tuning [65], which varies model parameters and utilizes a stopping metric to reduce overfitting. Sample weighting or balancing is also an important regularization technique [66,67] given the arbitrary spatial and temporal extents of ALS datasets. Forest landscapes can have sample distributions that are heavily skewed or have high kurtosis in one or more canopy variables, which can bias model predictions. For large-area predictions, spatial and temporal variance in the LiDAR-Landsat relationships also leads to poor model performance if there is inadequate training data [32,46,68]. One solution is to create many localized models from each individual LiDAR dataset and stitch the predictions together. However, determining boundaries where each local model is most applicable would be difficult. Conversely, a single global model can be used with either sufficient generality or including locational predictor variables to correct for spatial variance [47,49]. A select set of models, each trained on biophysically stratified data, may represent a compromise between these two extremes.
In this study, canopy fuels are estimated from LiDAR datasets and used as training data for GBM models using Landsat spectral data for predictor variables. Model predictions are then compared to current LANDFIRE data to quantify improvements due to LiDAR training data, GBM modeling, and regularization techniques. The research has three objectives: (1) Create and compare local, biophysically stratified, and global predictive model(s) of canopy fuels variables than can be applied to forested areas in the western US; (2) Compare model predictions to current LANDFIRE products; (3) Assess selected model(s) ability to update canopy layers following wildfire disturbance.

Data
LiDAR datasets were selected based on availability and to represent the diversity of conifer forests, climates, and disturbance regimes in the western contiguous US (Figure 1). LiDAR acquisitions were ultimately grouped into sixteen landscapes for analysis based on proximity and vegetation similarities. Hereafter, each set of grouped LiDAR acquisitions is referred to as a landscape dataset. Thirteen landscapes were used for model training, validation, and testing (training landscapes), while three landscapes were used exclusively for model testing (test landscapes). All LiDAR data acquisition parameters followed, at a minimum, the requirements for the US Geological Survey's Quality Level 1 [69]. Important collection parameters include ≥ 3 returns/pulse, ≥ 8 returns/m 2 , and a relative vertical accuracy ≤ 0.06 m root-mean-square deviation (RMSD). LiDAR data totaled 1,258,993 ha for training and validation and 265,225 ha for testing. Landscape elevations range from 0 to 3599 m a.s.l. and precipitation normals range from less than 200 mm to over 4000 mm annually [70]. At least one dataset is within each EPA Level I forest ecoregion of the western US ( Figure 1). Remote Sens. 2020, 12, 1000 4 of 37 could apply to many forest types in the western US. For CBH, LiDAR mean height of canopy points minus one standard deviation of heights provided a characterization with adequate correlation to field-based estimates at multiple study sites (R² = 0.547) [52]. For CBD, we used an established equation derived from field data and inserted LiDAR-derived CH and CC values [51,52]. LiDAR data were filtered for a minimum canopy height of 2 m, canopy cover of 2%, and the LANDFIRE existing vegetation type (EVT) classified as a forest type to ensure that non-forested pixels did not contaminate samples.

LiDAR Processing
LiDAR point clouds were processed through FUSION software [71] to extract height above ground and calculate metrics to match the 30 m Landsat cell resolution. All subsequent analyses were conducted at the 30 m cell resolution. Table 1 shows the formulation of the canopy fuel metrics from LiDAR. CH, CC, and CBH are defined directly from the LiDAR metrics. For CBH and CBD estimation, multiple possible formulations exist [52], and we chose parsimonious definitions that could apply to many forest types in the western US. For CBH, LiDAR mean height of canopy points minus one standard deviation of heights provided a characterization with adequate correlation to field-based estimates at multiple study sites (R 2 = 0.547) [52]. For CBD, we used an established equation derived from field data and inserted LiDAR-derived CH and CC values [51,52]. LiDAR data were filtered for a minimum canopy height of 2 m, canopy cover of 2%, and the LANDFIRE existing vegetation type (EVT) classified as a forest type to ensure that non-forested pixels did not contaminate samples.

Landsat and LANDFIRE Data Processing
Landsat indices (Table 1) were calculated for the contiguous US for the period of 2000-2016 using May to October imagery at an annual timestep. Median and maximum values of each index for each year were calculated using US Geological Survey Landsat 5 TM, Landsat 7 ETM+, and Landsat 8 OLI surface reflectance products created using the Landsat Ecosystem Disturbance Adaptive Processing (LEDAPS) algorithm [80,81]. The use of both median and maximum values followed studies, such as Egorov et al. [82], that showed multi-temporal metrics derived from annual composites enhanced predictive performance for forest canopy models and also followed the hypothesis that multi-temporal imagery can capture differences in shadowing (due to changing sun-angles) related to tree height [83]. Landsat TM and ETM+ were merged and constituted the data source for the period of 2000-2014, and OLI data was used for the period of 2015-2016. Equations for spectral calculations are within the citations in Table 1. Pixels were filtered for contamination by clouds, cloud shadows, and adjacency to clouds, snow, and water. All Landsat data was processed within Google Earth Engine, exported, and then mosaicked into multi-band rasters organized by year.
Topography and geographic metrics of slope, aspect, elevation, and latitude were taken from LANDFIRE [16]. Fire regime groups (FRGs) [79] and existing vegetation type (EVT) were also extracted along with LANDFIRE's existing canopy fuel layers CC, CH, CBH, and CBD [51] for comparison to model outputs.

Dataset Stratification, Sample Weighting, and Model Development
All subsequent data processing and model development were completed using R statistical software in conjunction with the Apache Spark analytics engine. Spark is an open-source, parallel, scalable, and resilient Big Data processing environment [84]. H2O machine learning algorithms were employed within R and Spark using the R packages h2o [85], sparklyr [86], and rsparkling [87]. Data preprocessing, stratification, visualization, and weighting were completed using the R packages raster [88], ggplot2 [89], and dplyr [90].
Three data stratification approaches were devised, leading to three sets of models ( Figure 2)-local models, models stratified by fire regime groups (FRGs), and a global model taking data from all landscapes. For the local models, training, validation, and test data were taken from within each individual landscape ( Figure 1) and a separate model derived for each. Recognizing that these local models are likely overfit to their respective landscapes, they represent the baseline accuracy or maximum potential extractable information that subsequent, more generalized models can be compared to. Next, we hypothesized that the nature of the spectral relationships may vary across forest types and separate models stratified by a biophysical rationale may better characterize these relationships. Fire regime groups produced by LANDFIRE represent an integration of existing and potential vegetation, climate, topography, and disturbance regimes [79]. Forests within the same FRG classifications were expected to have similar structural and species assemblages and therefore more consistent spectral responses. For forested areas, four fire regime groups were present within the study area: year fire return interval, any severity. FRG 5 represents a diverse category including multiple vegetation types across the US but in the western US and for the landscapes in this study, FRG 5 represents the western Cascade Mountains and coastal forests in the Mt. Baker, Hoh, North Coast, and South Coast landscapes. All pixels representing a particular FRG from all landscapes were binned and used to construct a model.
Finally, a global model containing all data from all landscapes was created as a parsimonious option requiring no stratification. However, with the significant differences in dataset sizes between landscapes (cf. Figure 1), predictor-response relationships in large datasets may dominate the smaller datasets. Thus, landscape datasets were randomly sampled so that each landscape's contribution did not exceed 20% of the total sample size for the global model. This rule was also applied to the FRG models for the same purpose, but the proportion was adjusted to 30% for FRG 4 and FRG 5 because only a few landscapes contained enough data for training.
An additional sample weighting scheme was applied for every model to improve generality and the ability to characterize post-disturbance canopy fuels on any landscape. For each of the four canopy response variables (CC, CH, CBH, CBD), the range of values was assessed and then split into ten even-sized bins (e.g., CC: 1-10%, 11-20%, 21-30%, etc.). The bins were ordered by sample size, and the samples within the bin with the highest count all received a weighting value of one (i.e., would be considered once in the model). The other bin's samples were then multiplied by the necessary weighting value so that the sum of the sample weights was equal across all bins. This effectively gave an equal weighting to the full range of values, helping to prevent model bias due to skewed or leptokurtic training data distributions. Several classes for certain models had very few samples within a class which led to extreme weighting values to few samples (> 100,000x weight in one case). Therefore, the maximum weight for any sample was set to 100x to reduce model instability created by extreme weighting.
Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 38 Figure 2. Flow diagram of model development and testing. See Table 1 for detail on response and predictor variable formulation and the Materials and Methods section for detail on all else.  Table 1 for detail on response and predictor variable formulation and the Materials and Methods section for detail on all else. Once stratification and weighting were completed, model development was then possible. From each model's sample pool, 80% were selected for training, 10% for validation and internal error estimation for hyperparameter tuning, and 10% for testing and performance assessment. However, these testing data's acquisition parameters, especially the year of acquisition, and the geographic location and vegetation type are still related to the particular landscape from which they were derived. Thus, three separate LiDAR datasets without any data used in the model training process were used as a final assessment of the local, FRG, and global model performance ( Figure 1, red landscapes). Table 2 shows the GBM model parameters including those varied during hyperparameter tuning. For more explanation of each parameter see H2O.AI GBM documentation [91]. A maximum of ten models were trained before selecting each final model, and the various parameter options were randomly selected for each model run. Final model selection used minimum RMSE as the selection metric.

Spectral Response and Model Performance Assessment
In order to assess the variance in the relationships between the spectral predictor variables and the canopy fuel response variables, partial dependence plots were calculated for the local, FRG, and global models. Partial dependence plots assess the marginal effect of a single predictor variable on the response variable [62]. The major assumption is that the predictor variables are independent of each other. While the GBM model is resistant to multi-collinearity, if a predictor is highly correlated with another, then data combinations created from the predictor distributions can be highly unlikely in reality and result in biased partial dependence estimates. Thus, local, FRG, and global models were first trained on a reduced subset of predictor variables for creation of partial dependence plots. The reduced set included only the median spectral index values as the median and maximum spectral indices were highly correlated for most predictors (Pearson r = 0.7-0.9).
For final model testing and comparison, models were then trained on the full set of predictor variables. Root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R 2 ) were used for evaluation. For the independent datasets (Illilouette, North Coast, and Slate Creek), the nearest local model predictions were used for comparison to the FRG and global model predictions. Predictions made on the independent datasets were also compared to existing LANDFIRE canopy metrics.
Model predictions should be able to accurately account for disturbance effects on canopy fuels, especially those due to wildfire. Within the spatial extents of the LiDAR datasets, we searched for wildfires that occurred after the LiDAR acquisition date but before 2016 so post-fire imagery the Remote Sens. 2020, 12, 1000 9 of 37 following growing season could be attained. Spectral indices for one year before the fire and one year after the fire were calculated. Burn severity maps from the Monitoring Trends in Burn Severity project [92] were acquired and model predictions were stratified by high, moderate, and unburned to low burn severity classes. The global model and local model pre-fire and post-fire predictions were compared to evaluate whether predictions followed post-fire expectations of change. CC, CH, CBH, and CBD were all hypothesized to decrease most in the high-severity classes, with the unburned to low class having little change.

Results
The partial dependence plots show non-linear predictor-response relationships. The canopy fuels exhibited relatively consistent response shapes but showed shifts in absolute values among landscapes (Figures 3 and 4). For example, for the normalized difference vegetation index (NDVI)-CC relationship (top left, Figure 3), the mean response for a median NDVI value of 0.6 is~40% CC for the Ochoco landscape and~80% for the Garcia landscape. CBH and CBD had more inconsistent response shapes between landscapes overall compared to CC and CH. The global model compared favorably to the local models in the performance assessment ( Table 3). The complete breakdown of each landscape including FRG model accuracies are shown in Table A1. The following mean change in performance metrics are averages of local-global comparisons, with equal weighting given to each landscape regardless of geographic area or sample size. For CC, the use of the global model increased error by 0.08% and 0.11% for RMSE and MAE, respectively, and increased R 2 by 0.004. R 2 decreased slightly with the global model for all landscapes, except for the Garcia and Hoh landscapes (Table A1)  . Partial dependence plots for each predictor variable (x-axes) and the canopy cover and canopy height response variables (y-axes). Each line shows the partial dependence using the local (colored) and global (black) landscape datasets.  . Partial dependence plots for each predictor variable (x-axes) and the canopy base height and canopy bulk density response variables (y-axes). Each line shows the partial dependence using the local (colored) and global (black) landscape datasets. . Partial dependence plots for each predictor variable (x-axes) and the canopy base height and canopy bulk density response variables (y-axes). Each line shows the partial dependence using the local (colored) and global (black) landscape datasets.
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 38 Figure 5. Partial dependence plots for each predictor variable (x-axes) and the canopy cover and canopy height response variables (y-axes). Each line shows the partial dependence using the fire regime group (FRG; colored) and global (black) landscape datasets. . Partial dependence plots for each predictor variable (x-axes) and the canopy cover and canopy height response variables (y-axes). Each line shows the partial dependence using the fire regime group (FRG; colored) and global (black) landscape datasets.
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 38 Figure 6. Partial dependence plots for each predictor variable (x-axes) and the canopy base height and canopy bulk density response variables (y-axes). Each line shows the partial dependence using the fire regime group (FRG; colored) and global (black) landscape datasets.
The global model compared favorably to the local models in the performance assessment ( Table  3). The complete breakdown of each landscape including FRG model accuracies are shown in Table   Figure 6. Partial dependence plots for each predictor variable (x-axes) and the canopy base height and canopy bulk density response variables (y-axes). Each line shows the partial dependence using the fire regime group (FRG; colored) and global (black) landscape datasets. The performance comparisons of individual landscapes to the global model predictions largely follow this same trend of slight increases in error with global model use (Table A1). The independent test landscapes flip this trend, however, with the global modeling performing better in 9 of 12 (75%) landscape-response variable combinations (Table A2). The starkest difference is in the Slate Creek landscape for CBD, where the R 2 using the nearest local model (Clear Creek) was −0.889 and use of the global model increased the R 2 to 0.479. The South Coast model performed better than the global model in the North Coast landscape for CH, CBH, and CBD. The FRG models also performed better than the nearest local models in most comparisons, on par or near to global model performance. The performance of the global models on the test landscapes is further highlighted in the predicted versus observed graphs (Figures 7-9).
The predicted versus observed graphs also show the increase in performance compared to the existing LANDFIRE data. The global model reduced RMSE on average by 11.3% for CC, 5.45 m for CH, 5.78 m for CBH, and 0.062 kg/m 3 for CBD compared to LANDFIRE. The LANDFIRE data had little ability to characterize canopy fuels in general with negative R 2 values for 8 of 12 (66.7%) LANDFIRE-response variable combinations. The global model performed worse on the test landscapes compared to the training landscapes, but still produced a mean R 2 of 0.439 overall compared to a mean R 2 of −1.375 for LANDFIRE. The global model performed comparably to the training landscapes for the Illilouette and North Coast landscapes except for Illilouette CBH, which had a mean R 2 of −0.094. For Slate Creek, R 2 values were noticeably worse than training landscapes, with a mean R 2 of 0.372.
For comparisons between response variables using the global model, CBH proved the most difficult to characterize with a mean R 2 of 0.547 for the training landscapes and 0.153 for the test landscapes. CC had the best model performance overall, with a mean R 2 of 0.730 for the training landscapes and 0.611 for the test landscapes. CBD had a mean R 2 of 0.602 for the training landscapes and 0.507 for the test landscapes. CH had a mean R 2 of 0.631 for the training landscapes and 0.385 for the test landscapes.
The Ochoco landscape contained the Corner Creek Fire mostly within its extent with an ignition date of 29 June 2015. Pre-fire spectral indices were calculated using 2014 imagery and post-fire with 2016 imagery. Model predictions for both years using the global model and local (Ochoco) landscape model were compared (Figures 10-13). High-burn severity pixels showed the largest changes in canopy predictions with a mean percent decrease of 51.2% for CC (     model were compared (Figures 10-13). High-burn severity pixels showed the largest changes in canopy predictions with a mean percent decrease of 51.2% for CC (55.6% to 27.13%), 39.4% for CH (28.0 to 16.9 m), 40.3% for CBH (8.3 to 5.0 m), and 55.0% for CBD (0.135 to 0.061 kg/m 3 ) using the global model. The local Ochoco model showed similar results but the percent changes were larger for CC (60.0% decrease), smaller for CH (14.9% decrease), larger for CBH (46.2% decrease), and smaller for CBD (52.2% decrease).    Moderate-burn severity pixels followed expectations with mean decreases for each canopy variable but at reduced percentages compared to high-burn severity pixels. For the global model, CC decreased by 31.7%, CH decreased by 20.0%, CBH decreased by 26.3%, and CBD decreased by 32.5%. For the Ochoco local model, CC decreased by 38.0%, CH decreased by 1.2%, CBH decreased by 29.5%, and CBD decreased by 32.8%.
For unburned to low-burn severity pixels, little change was seen in the canopy fuel variables. For the global model, CC had a mean percentage decrease of 5.8%, CH decreased by 2.3%, CBH decreased by 5.3%, and CBD decreased by 2.5%. For the local Ochoco model, CC decreased by 7.9%, CH increased by 3.8%, CBH decreased by 4.5%, and CBD decreased by 5.1%. Figure 14 shows the variable importance for the four canopy fuel variables using the global model. The spectral predictors Med NBR, Med Bright, and Med Green each had the highest importance for at least one variable. Med NBR had the highest importance for CC and CBH and was also of secondary importance for CH and CBD. Med Green had low importance for CC, CH, and CBH but was the highest for CBD. This was an unexpected result considering CBD is calculated using CC and CH, and Med Green had low variable importance for these two variables. Latitude, elevation, and aspect had moderate importance for three of the four canopy variables excluding CBD. The median spectral indices had higher importance than all their maximum counterparts. Tasseled cap wetness had the least importance with the maximum and median indices in the bottom Moderate-burn severity pixels followed expectations with mean decreases for each canopy variable but at reduced percentages compared to high-burn severity pixels. For the global model, CC decreased by 31.7%, CH decreased by 20.0%, CBH decreased by 26.3%, and CBD decreased by 32.5%. For the Ochoco local model, CC decreased by 38.0%, CH decreased by 1.2%, CBH decreased by 29.5%, and CBD decreased by 32.8%.
For unburned to low-burn severity pixels, little change was seen in the canopy fuel variables. For the global model, CC had a mean percentage decrease of 5.8%, CH decreased by 2.3%, CBH decreased by 5.3%, and CBD decreased by 2.5%. For the local Ochoco model, CC decreased by 7.9%, CH increased by 3.8%, CBH decreased by 4.5%, and CBD decreased by 5.1%. Figure 14 shows the variable importance for the four canopy fuel variables using the global model. The spectral predictors Med NBR, Med Bright, and Med Green each had the highest importance for at least one variable. Med NBR had the highest importance for CC and CBH and was also of secondary importance for CH and CBD. Med Green had low importance for CC, CH, and CBH but was the highest for CBD. This was an unexpected result considering CBD is calculated using CC and CH, and Med Green had low variable importance for these two variables. Latitude, elevation, and aspect had moderate importance for three of the four canopy variables excluding CBD. The median spectral indices had higher importance than all their maximum counterparts. Tasseled cap wetness had the least importance with the maximum and median indices in the bottom four of importance. Max NDVI and Max NBR were the other two predictors with the lowest importance overall. four of importance. Max NDVI and Max NBR were the other two predictors with the lowest importance overall.

Discussion
The global models proved most suitable for predicting canopy fuels over the western US. The global models performed nearly as well as the local models for each training landscape and out-performed the nearby local models on the independent test datasets. In a few instances, the local model failed entirely to characterize fuels on the test landscape. For example, in the Slate Creek test landscape, the Clear Creek local model had a -0.889 R², while the global model had a 0.479 R² for CBD. Perhaps a more representative local landscape could have produced improved predictions, but additional analysis would be required to determine suitable models. This highlights the primary utility of a global GBM model, the ability to apply one set of models to produce broad-area predictions. The fire regime models have nearly equivalent performance to the global models, but data stratification and additional separate models are unnecessary given the global model performance.

Discussion
The global models proved most suitable for predicting canopy fuels over the western US. The global models performed nearly as well as the local models for each training landscape and out-performed the nearby local models on the independent test datasets. In a few instances, the local model failed entirely to characterize fuels on the test landscape. For example, in the Slate Creek test landscape, the Clear Creek local model had a −0.889 R 2 , while the global model had a 0.479 R 2 for CBD. Perhaps a more representative local landscape could have produced improved predictions, but additional analysis would be required to determine suitable models. This highlights the primary utility of a global GBM model, the ability to apply one set of models to produce broad-area predictions. The fire regime models have nearly equivalent performance to the global models, but data stratification and additional separate models are unnecessary given the global model performance.
The global model may show increased sensitivity to disturbance as well. The analysis of pre-and post-fire predictions showed intuitive reductions in canopy fuels corresponding to burn severity classes, which implies the model could be applied on an annual basis to maintain updated canopy fuels maps. The local Ochoco model predicted almost no change in canopy height in moderate-severity fire classes (mean 1.2% decrease) compared to global model estimates of a 20% mean decrease ( Figure 11). This could add additional value and supplement ongoing improvement of LANDFIRE data products [93].
Without post-fire LiDAR or field data, the results of the fire disturbance analysis must be considered encouraging but unsubstantiated.
The GBM boosting approach is likely able to differentiate the areas of variance in the predictor-response relationships observed in the partial dependence plots. In effect, a single GBM can create localized models within the ensemble process, as samples with different feature-response relationships would show as residuals that GBM then focuses on in subsequent trees. The use of latitude and its moderate importance for 3 of 4 of the canopy fuel variables supports this assertion as spatial variance in spectral-canopy fuel relationship can be captured within this locational metric. While longitude was considered as a predictor variable and a version of the global model containing longitude was evaluated on the test landscapes, model performance was not significantly improved and showed performance reductions in several cases. With more adequate dataset coverage, inclusion of longitude would likely improve global model predictions and should be included then.

Comparisons
Comparisons to LANDFIRE canopy fuel products show increased performance in every case. In tandem with GBM and multi-temporal Landsat data, the consistent characterization of structure, geographic diversity of datasets, and large sample size made available by LiDAR all provide added value. However, these LiDAR-based metrics are derived differently than LANDFIRE-based canopy fuel metrics. Indeed, ambiguity in the formulation of CBH and CBD, whether field or LiDAR based, exists in general and partially stems from the inability to consistently measure these metrics in the field [94]. LiDAR CC and CH metrics can be inserted directly into the LANDFIRE CBD equation for near equivalency (Table 1). However, this equation was only one of two used to calculate CBD in Reeves et al. [51] but determining which equation was applied at the pixel level is not possible and could explain part of the mismatch in the comparisons here. CBH's LiDAR-based definition (Table 1) is not equivalent to LANDFIRE's formulation, which is the lowest vertical height at which the vertical distribution of CBD is ≥ 0.012 kg/m 3 [95], though there is conceptual correspondence. As Peterson et al. [52] notes, the LiDAR-derived definition of CBH overpredicts compared to Reinhardt et al. [95] but represents a directly measured and parsimonious characterization of the vertical distribution of canopy biomass. A bias correction would be necessary for fire modeling applications because overprediction of CBH leads to underprediction of crown fire in operational fire models [96]. The best interpretation for application is that these remotely sensed canopy fuel layers relate to and covary in a similar fashion but do not replicate the canopy fuel variables as originally conceived for the fire models.
Direct comparisons of findings to other research are difficult given the breadth of the study area, the use of LiDAR-derived canopy fuel-specific response variables, and the diversity of performance metrics used in the literature. Matasci et al. [49] predicted forest canopy variables over the entirety of the Canadian boreal shield and reported an R 2 of 0.495 for CH and 0.612 for CC. Hansen et al. [42] used space-borne LiDAR and Landsat ETM+ and OLI data to predict tree height across Sub-Saharan Africa and reported an MAE of 2.45 m. They reported an MAE of 4.65 m for tree heights > 20 m though, which is more consistent in error and observed tree heights presented here (MAE 3.99 m for training landscapes and 5.78 m for test landscapes). Wilkes et al. [41] also mapped canopy heights over a broad-area in Australia and reported an RMSE of 5.6 m. Ahmed et al. [40] focused on a 2600 ha area in British Columbia and reported an R 2 of 0.67 for CC and 0.82 for CH. Stojanova et al. [39] studied Slovenian forests and reported an RMSE of~14.7% for CC and~2.1 m for CH. Hyde et al. [37] utilized a dataset near the Dinkey landscape from this study and reported similar results (Dinkey-specific results from this study in parentheses): for max height, an R 2 of 0.712 (0.675) and an RMSE of 9.6 m (6.88 m); for mean height, an R 2 of 0.603 and an RMSE of 7.5 m; for SD of heights, an R 2 of 0.517 and an RMSE of 3.7 m (Dinkey CBH R 2 is 0.508 and RMSE 2.73 m). Pascual et al. [38] reported an R 2 of 0.62 for mean height and 0.66 for height coefficient of variation (CV) using Landsat imagery in Spain. Erdody and Moskal [97] used field-based estimates of canopy fuels and related them to Landsat spectral indices in south-central WA and reported R 2 values of 0.415 for canopy height, 0.309 for CBH, and 0.602 for Remote Sens. 2020, 12, 1000 24 of 37 CBD. While some of these studies produce better results when comparing to the test landscapes held entirely out from model training, the test data taken from the training landscapes are the more accurate and competitive comparisons as none of these studies use entirely independent LiDAR datasets from different landscapes.

Relevance of Predictors and Predictor-Response Spatiotemporal Variance
While differences in LiDAR acquisition parameters, especially point density, can be small for height and cover [98], these differences likely contribute a source of error especially for CBH, which requires laser penetration into the canopy [55]. The LiDAR data was also acquired from 2009 through 2015 and annual differences in the spectral indices affect model development and performance. The differences in the partial dependence plots among landscapes are a combination of site-specific (spatial) and temporal effects. The temporal variation is driven primarily by precipitation timing and amount, temperature fluctuations, and variable cloud and cloud shadow cover leading to inconsistencies in the median and maximum spectral indices [99,100]. The consistency in the model predictions for the unburned to low-severity portions of the Corner Creek Fire using predictor variables derived from different years and different sensors (TM/ETM+ in pre-fire 2014 and OLI in post-fire 2016) supports the use of the global model over time (Figures 10-13). Additional assessment is necessary to characterize the model's ability to predict over time, but the results here imply a level of harmonization using these spectral indices without necessitating extensive image calibration.
As seen in the results, the spatial variance in the predictor-response relationships varied depending on the particular predictor and response variable. The relationships that typically had the most consistency (e.g., TC Brightness and CH, Figure 3) also had the highest variable importance (Figure 14). Differences in species composition, both in the canopy and the understory, in combination with bidirectional reflectance and solar zenith angle effects likely caused a majority of the differences among landscapes [46,101]. In addition, the range and distribution of canopy fuel values present in the dataset had a substantial effect. Even with balancing of the datasets before modeling, low sample sizes and skewed distributions can bias partial dependence plots and model predictions, a case of 'regression to the mean' [102]. For example, the Garcia landscape showed little sensitivity to any spectral indices for CC because 75% of samples were above 87.9% CC.
While topographic and locational features certainly aid in the modeling process, they are static through time and models based primarily on these types of data will not be as responsive to forest structure change. For example, Matasci et al. [49] achieves relatively high accuracy in prediction across Canada, but four of the top five predictors in terms of variable importance are either topographic or locational in nature (elevation, latitude, longitude, and slope). This suggests those model predictions may produce similar predictions over time regardless of changes in forest canopy structure. In this study, elevation, aspect, and latitude show moderate variable importance, but three spectral indices are considered much more important. For CC and CBH, Med NBR had the highest importance, and Med Bright and Med Green had the highest importance for CH and CBD, respectively. This implies that the model is sensitive to spectral change; an assertion supported by the results from the partial dependence plots and the Corner Creek Fire on the Ochoco landscape, which shows large reductions in canopy fuels in high-burn severity pixels, moderate change in moderate-burn severity pixels, and little change in low to unburned pixels. Although validation data are not available for the wildfire assessment, the changes predicted are logical and follow established definitions of fire severity.

Potential Improvements and Future Work
Despite the large reductions in fuels shown from the Corner Creek Fire, however, the models are potentially underestimating the amount of change and would not predict an ecosystem type change. As noted in multiple modeling studies at these scales [42,102] and seen in the predicted vs. observed plots here (Figures 7-9), predictions tend to overestimate at the low end and underestimate at the high end of the value ranges. In addition, the models were only trained with forested samples and thus have no ability to predict ecosystem type changes to grass-or shrub-dominated areas. A separate step in a complete algorithm could delineate forest or non-forest ecosystem types before applying canopy fuel models as done in LANDFIRE [16].
Though Landsat TM and ETM+ sensors are practically equivalent, raw Landsat 8 OLI images require transformations for continuity [103]. While the tasseled cap transformations were designed to maintain continuity among sensors [77], NDVI and NBR do not have such corrections considered. While Roy et al. [103] shows a small but significant difference in NDVI between ETM+ and OLI sensors, no such assessment has been performed for NBR, and slight differences are expected. Correcting for differences in these two normalized metrics could potentially improve temporal continuity and minimize differences in the imagery datasets and model predictions.
The tasseled cap transformations themselves were designed for top-of-atmosphere (TOA) reflectance but applied to the surface reflectance products here. Inconsistencies in the literature are present with multiple approaches used but in general, the application of TOA transformations has been successful with surface reflectance products. Indeed, the scenes used to develop the transformations were specifically chosen to have little atmospheric contamination [76][77][78]. DeVries et al. [104] argue for the use of one set of coefficients for use with surface reflectance products from multiple sensors and apply those of Crist [76], which were designed for Landsat-5 TM data. Kennedy et al. [105] use the same logic but first applied a scene normalization algorithm to make the spectral space relatively consistent. As Baig et al. [78] and Huang et al. [77] designed the Landsat-7 ETM+ and Landsat-8 OLI transformations for temporal continuity with previous sensors, the transformations specific to each sensor are used here ( Table 1).
Addition of time-series metrics derived from multiple years of Landsat data may also improve model predictions (use of time-series data reviewed by Banskota et al. [106]). The primary benefit in this case would be the ability to identify previous disturbance timing and severity and potentially help stabilize the spectral indices' year-to-year variability caused by cloud and shadow contamination or precipitation variability. Although Zald et al. [47] and Matasci et al. [49] employ these sophisticated metrics, their variable importance was minimal in their final assessments. A major downside is the need to re-calculate the temporal trends annually, which can be computationally expensive and would reduce the utility of a single trained model that can make predictions on any annual composite imagery regardless of year of acquisition. Initial tests using time-series metrics on a single landscape were not promising, though there is still potential for integrating these metrics to improve model predictions.
Additional complex topographic, climatological, weather, and energy balance variables could also improve model performance if properly implemented. For example, the topographic wetness index [107] and climatic water deficit [108] are variables that influence forest development. However, these features describe the environmental template on which vegetation and disturbance act upon and their utility for characterizing the current state of canopy fuels is limited and may lead to overfitting. The use of gridded temperature and precipitation data may also be useful but with similar caveats. Additional LiDAR datasets would also likely improve the model's ability to predict in new areas. In this study, only one dataset is present for the state of CO, one for AZ, and none for NM, UT, WY, SD, and NV. The dataset balancing techniques used here would properly integrate the addition of large and small LiDAR acquisitions. With the method's focus on model generality and the data processing architecture utilized, significantly more data can be added, and predictions will improve with dataset additions. Integration of field-based datasets covering large geographic areas, such as those utilized for LANDFIRE [51], could enhance model development and validation as well. The continued development of LiDAR-based fuel metrics, especially CBD and CBH, could utilize these field data as well.
Finally, a necessary subject of future research is to develop more consistent methods to predict surface fuel models to accompany the improved canopy fuels data produced in this study, because changes in surface fuel models generally have a stronger effect on predicted fire behavior than canopy fuels [109]. Application of rule sets developed as part of the LANDFIRE Total Fuel Change Tool [110] may provide a starting point for predicting surface fuels from improved (and continuous rather than categorical) estimates of canopy properties.

Conclusions
LiDAR-Landsat fusion is a capable replacement for existing LANDFIRE canopy fuel mapping protocols, is more easily implemented, and produces better results. A single GBM global model offers a parsimonious solution with small decreases in performance compared to the use of many local models and does not require logic to determine where each local model is most applicable. Local model partial dependence plots show spatiotemporal variability in predictor-response relationships but the relationships are relatively consistent across the potential range of values. The global model is able to account for these differences and shows increased generality by outperforming local models on independent datasets. The global model is also able to logically update canopy fuels after wildfire disturbance with similar performance compared to the local model. The increased accuracy and better representation of canopy fuel variability over broad areas will increase the ability to predict fire growth and intensity and therefore enhance land management decision making for pre-fire, during-fire, and post-fire activities.  Table A1. Performance assessment of local, global, and fire regime group (FRG) models using training landscapes. In total, 80% of landscape data used for training, 10% for validation, and 10% for testing. N refers to the number of samples used for testing and calculation of performance metrics, which are root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R 2 ).