Spatially Explicit Large Area Biomass Estimation: Three Approaches Using Forest Inventory and Remotely Sensed Imagery in a GIS.

Forest inventory data often provide the required base data to enable the largearea mapping of biomass over a range of scales. However, spatially explicit estimates ofabove-ground biomass (AGB) over large areas may be limited by the spatial extent of theforest inventory relative to the area of interest (i.e., inventories not spatially exhaustive), orby the omission of inventory attributes required for biomass estimation. These spatial andattributional gaps in the forest inventory may result in an underestimation of large areaAGB. The continuous nature and synoptic coverage of remotely sensed data have led totheir increased application for AGB estimation over large areas, although the use of thesedata remains challenging in complex forest environments. In this paper, we present anapproach to generating spatially explicit estimates of large area AGB by integrating AGBestimates from multiple data sources; 1. using a lookup table of conversion factors appliedto a non-spatially exhaustive forest inventory dataset (R² = 0.64; RMSE = 16.95 t/ha), 2.applying a lookup table to unique combinations of land cover and vegetation densityoutputs derived from remotely sensed data (R² = 0.52; RMSE = 19.97 t/ha), and 3. hybridmapping by augmenting forest inventory AGB estimates with remotely sensed AGB estimates where there are spatial or attributional gaps in the forest inventory data. Over our714,852 ha study area in central Saskatchewan, Canada, the AGB estimate generated fromthe forest inventory was approximately 40 Mega tonnes (Mt); however, the inventoryestimate represents only 51% of the total study area. The AGB estimate generated from theremotely sensed outputs that overlap those made from the forest inventory based approachdiffer by only 2 %; however in total, the remotely sensed estimate is 30 % greater (58 Mt)than the estimate generated from the forest inventory when the entire study area isaccounted for. Finally, using the hybrid approach, whereby the remotely sensed inputswere used to fill spatial gaps in the forest inventory, the total AGB for the study area wasestimated at 62 Mt. In the example presented, data integration facilitates comprehensiveand spatially explicit estimation of AGB for the entire study area.


Introduction
Forest biomass is defined by [1] as the above-ground portion of live trees per unit area. In addition to widespread use in carbon budget models [2,3,4,5,6], biomass estimates are important for a broad range of applications, including: characterizing forest conditions and processes [7,8,9,10]; estimating forest productivity [11,12,13,14]; modeling impacts of fire and other disturbances [15,16,17,18]; and, for modeling the environmental and economic consequences of energy production from biomass [19,20,21,22]. Monitoring changes to biomass over time has also emerged as an important activity for many of these aforementioned applications [23,24,25].
Biomass estimates may range from local to global scales, and for some regions, particularly tropical forest regions, there are large variations in the estimates reported in the literature [5,26,27,28,29]. Global and national estimates of forest above-ground biomass (AGB) are often aspatial estimates, compiled through the tabular generalization of national level forest inventory data [1, 30,31,32,33,34,35,6]. Methods and data sources for generating spatially explicit large-area AGB estimates have been the subject of extensive research [8,25,28,36,37,38,39].

Biomass Estimation Methods
A variety of approaches and data sources have been used to estimate forest above ground biomass (AGB). A comprehensive review of remote sensing-based estimates of AGB has been completed, categorized by data source: (i) field measurement; (ii) remotely sensed data; or (iii) ancillary data used in GIS-based modeling [40]. Estimation from field measurements may entail destructive sampling [21,41] or direct measurement [42] and the application of allometric equations [43,44,45]. Allometric equations estimate biomass by regressing a measured sample of biomass against tree variables that are easy to measure in the field (e.g., diameter at breast height, height). 607 different equations have been identified in the literature for estimating AGB for tree species growing in Europe [46]. Although equations may be species-or site-specific, they are often generalized to represent mixed forest conditions or large spatial areas [11,47]. Biomass is commonly estimated by applying conversion factors (biomass expansion factors) to tree volume (either derived from field plot measures or forest inventory data) [1, 2,11,30,32,48]. Relationships between biomass and other inventory attributes (e.g., basal area) [49] have also been reported. The use of existing forest inventory data to map large area tree AGB has been explored [8]; conversion tables were developed to estimate biomass from attributes contained in provincial forest inventory data, including species composition, crown density, and dominant tree height. Guidance on the selection, development, and application of appropriate biomass factors and allometric equations for large-scale biomass estimation was provided [29].
Remotely sensed data have become an important data source for biomass estimation. The remotely sensed data types and approaches used for biomass estimation have been summarized [40,50]. Generally, biomass is either estimated via a direct relationship between spectral response and biomass using multiple regression analysis [51], k-nearest neighbor [52], neural networks [53], or through indirect relationships, whereby attributes estimated from the remotely sensed data, such as leaf area index (LAI), structure (crown closure and height) or shadow fraction are used in equations to estimate biomass [12,36,38,54,55,56]. Four different remotely sensed methods for AGB estimation were compared for a test area in western Newfoundland and the relative advantages of the different approaches were assessed, concluding that the choice of method depends on the required level of precision and the availability of plot data [57]. Some methods, such as k-nearest neighbor require representative image-specific plot data, whereas other methods are more appropriate when scenespecific plot data are limited [36].
A variety of remotely sensed data sources continue to be employed for biomass mapping including coarse spatial resolution data such as SPOT-VEGETATION and AVHRR [25,58] and MODIS [12,47,59,60,61]. To facilitate the linkage of detailed ground measurements to coarse spatial resolution remotely sensed data (e.g., MODIS, AVHRR, IRS-WiFS), several studies have integrated multi-scale imagery into their biomass estimation methodology and incorporated moderate spatial resolution imagery (e.g., Landsat, ASTER) as an intermediary data source between the field data and coarser imagery [52,60,62,63]. Research has demonstrated that it is more effective to generate relationships between field measures and moderate spatial resolution remotely sensed data (e.g., Landsat), and then extrapolate these relationships over larger areas using comparable spectral properties from coarser spatial resolution imagery (e.g., MODIS). Following this approach alleviates the difficulty in linking field measures directly to coarse spatial resolution data [40].
Landsat TM and ETM+ data are the most widely used sources of remotely sensed imagery for forest biomass estimation [36, 37, 38, 53, 57, 58, 63,], but data from other moderate spatial resolution sensors have also been used, including ASTER [60] and Hyperion data [64]. Similarly, high spatial resolution data such as QuickBird [56] and IKONOS [65] have been used for forest biomass estimation. Numerous studies have generated stand attributes from LIDAR data, and then used these attributes as input for allometric biomass equations [66,67,68,69,70]. Other studies have explored the integration of LIDAR and RADAR data for biomass estimation [71,72,73].
GIS-based modeling using ancillary data exclusively, such as climate normals, precipitation data, topography, and vegetation zones is another approach to biomass estimation [74,75]. Some studies have also used geostatistical approaches (i.e., kriging) to generate spatially explicit maps of AGB from field plots [24,28], or to improve upon existing biomass estimation [76]. More commonly, GIS is used as the mechanism for integrating multiple data sources for biomass estimation (e.g., forest inventory and remotely sensed data) [9,37,77,78]. MODIS, JERS-1, QuickSCAT, SRTM, climate and vegetation data have been combined to model forest AGB in the Amazon Basin [27]. Similarly, a combination of MODIS and ancillary data (precipitation, temperature, and elevation) has been used to model AGB over large areas [59].
The advantage of approaches that incorporate some form of remotely sensed data, is through provision of a synoptic view of the area of interest, thereby capturing the spatial variability in the attributes of interest (e.g., height, crown closure) [63]. The spatial coverage of large area biomass estimates that are constrained by the limited spatial extent of forest inventories may be expanded through the use of remotely sensed data [8]. Similarly, remotely sensed data can be used to fill spatial, attributional, and temporal gaps in forest inventory data, thereby augmenting and enhancing estimates of forest biomass and carbon stocks derived from forest inventory data [79]. Such a hybrid approach is particularly relevant for non-merchantable forests where basic inventory data required for biomass estimation are lacking.

Biomass: The Canadian Context
As discussed previously, some methods for biomass estimation, such as the k-nearest neighbor rely on the link between ground plots and satellite imagery [80,81,82,83,84]. Methods such as these are routinely applied in countries where an extensive regional network of field plots is exploited. One of the limitations to using this approach in Canada is the scarcity of ground-based inventory plots. The lack of sufficient and high-quality sample plots has been identified as a major barrier to the development of robust AGB estimates, and to the subsequent validation of these estimates [40]. The total area of forest and other wooded land in Canada covers an estimated 402.1 Mha [85]. Management of these forests is largely under provincial and territorial government jurisdiction, resulting in differences in forest inventory strategies and methods [86]. A new National Forest Inventory program will provide a framework for the provincial and federal governments to harmonize inventory data collection for national reporting purposes [87,88]. In the interim, since some provinces have limited or outdated plot information, there are merits to developing a method for estimating biomass that does not rely on an extensive network of ground plots. The ability to generate a forest stand map from interpreted aerial photography (the typical inventory scenario in Canada) provides an important source of information for estimating and mapping biomass [89]. Furthermore, data collection over large areas is facilitated by the synoptic coverage of satellite images [90,91,92] and may be used as input for models or for mapping forest biophysical attributes [93].
Estimates of forest AGB are required input for Canada's national forest Carbon Monitoring, Accounting and Reporting system [2,94,95], designed to fulfill Canada's reporting requirements under the 1997 Kyoto Protocol [23]. The first national-level forest biomass estimate for Canada was compiled from forest inventory data [1] and subsequently updated [32]. Forest AGB is an important attribute in Canada's new National Forest Inventory [87,88], and national tree AGB biomass equations have been developed [96] and applied [97] to generate regional and national estimates of forest biomass [89].
In Canada, stand level forest inventory data are collected (at the provincial level) to aid foresters with management decisions, the latter of which are largely a function of forest stand volume, by species. Consequently, forest inventory data are structured to provide volume (by species) exclusively for merchantable stands [98]. Merchantable timber is defined as "a tree or stand that has attained sufficient size, quality and/or volume to make it suitable for harvesting" [99]. Only a limited suite of attributes are typically collected (through air photo interpretation) for stands considered nonmerchantable or non-productive, and typically, no structural or volume attributes are recorded. As a result, inventory-to-biomass conversion equations cannot be applied to these stands, resulting in spatial gaps in biomass estimates relying on specific attributes in the forest inventory data [8,32,89]. In the most recent national estimation of biomass, attributional gaps in forest inventory data (i.e., records with no volume data) were circumvented by either (i) generating missing volumes for merchantable stands using lookup tables of average volume per hectare; and, using lookup tables of average biomass per ha, (ii) driven by ecozone and predominant genus (for treed, non-merchantable stands); or (iii) driven by land cover type (for vegetated, non-treed records). For (ii) and (iii), lookup tables were generated using permanent or temporary sample plot data and published information [97].

Objective
Given the advantages of using remotely sensed data for AGB estimation, the integration of multisource data (which includes remotely sensed data) within a GIS is one possible approach to generating spatially explicit estimates of AGB over large areas [40]. The objective of this investigation was to demonstrate how data integration can facilitate spatially explicit AGB estimation over a large area where forest inventory exists but is not spatially exhaustive, and where only a sparse set of field data are available. To achieve this objective, three different approaches to AGB estimation are implemented and reported upon. First, spatially explicit AGB is estimated using forest inventory data that is not spatially or attributionally comprehensive for the study area. Second, spatially explicit AGB for the entire study area is estimated using land cover and vegetation density outputs generated from moderate resolution remotely sensed imagery. Finally, a hybrid approach is followed whereby AGB estimates generated from the forest inventory are augmented with estimates of AGB generated from the remotely sensed data in areas where the forest inventory data has attributional or spatial gaps. The intent is to demonstrate how spatially explicit estimates of biomass may be generated by using the forest inventory data where available, and by filling gaps in these estimates with other data sources in order to provide a complete and spatially explicit representation of biomass resources.

Study Area
The study area, located in central Saskatchewan, Canada, encompasses approximately 714,852 ha and was selected to represent a variety of boreal forest conditions and species types (Figure 1). Located near the Southern Study Area established as a component of the BOReal Ecosystem Atmosphere Study (BOREAS) [100,101], the study area is found largely within the boreal plains ecoregion, with a tree species transition from deciduous-dominated mixedwoods in the southern portion, to coniferous-dominated mixedwoods in the northern portion [102,103]. The area contains mixed boreal forest, composed of aspen (Populus tremuloides) and white spruce (Picea glauca) which are common where the sites are well drained. Jack pine (Pinus banksiana) and black spruce (Picea mariana) are found on dry sites composed of coarse-textured soils. In poorly drained areas, bogs support black spruce and tamarack (Larix laricina). Also present are fen areas, which are composed mostly of sedge vegetation with discontinuous cover of tree species such as tamarack. Forest disturbance is largely the result of localized logging operations and fire. Recent fires have been limited in their spatial extent and frequency through a comprehensive forest fire suppression program [104].
The study area is characterized by a continental climate, with a range of temperature and precipitation conditions found seasonally. For example, monthly precipitation ranges from approximately 19 to 74 mm, between January and July, and constitutes one of the most important limiting factors for ecosystem productivity. Average daily temperature ranges from approximately -22 to 17 °C [105]. The growing season is from March to November and the mean number of growing degree days is 1300 over the range of elevation and latitude within the study area, based upon 1951 to 1980 climate norms [106].
Located within the Saskatchewan Plains Region of the Great Plains Province of North America, the topography of the study area is of gentle relief, with elevations ranging from 400 to 700 m. The dominant landforms consist of glacial till plains, and rolling or hilly moraines. The geomorphological origin of the landforms found in the pilot region are glaciofluvial, glaciolacustrine, fluvial lacustrine, alluvial, and aeolian. As a result, soil development has been upon thick glacial deposits. Soils range from grey wooded to degraded black with brunisolic, gleysolic, chernozemic, luvisolic and organic soil orders [104].

Field Data
As a component of the BOREAS project [101], forest mensuration data were collected at 130 sample locations during the summer of 1994. Published data describe in detail the plot locations [107], measurement methods, and overstorey characteristics [108]. Trees were considered appropriate for overstorey sampling based upon a height greater than 1.3 m. Point-samples (with a Basal Area Factor ranging from 0.394 to 3 m 2 ha -1 ) or fixed-area plots (ranging in size from 25 to 100 m 2 ) were used for overstory measurements [109]. For each overstorey plot, a number of attributes were collected including DBH, species, canopy class, and health status. For a sub-sample of trees within each plot, representing a range of canopy classes and species, additional attributes were collected including: height, crown diameter, and core samples (for age determination). Understorey measurements used fixed area plots of 4 or 25 m 2 and percent cover was estimated visually for all species present [109].
Stand basal area and stem density were calculated from these measured data [109]. Volume, or live stem volume, is an estimate of the total volume of wood in tree stems for the stand including the entire stem, not only the merchantable component of the tree as measured for commercial inventory purposes [108]. Tree volumes were calculated from estimated height (or measured height, if available) and measured DBH, multiplied by the number of stems per ha, and summed to estimate total plot volume [109]. Total AGB for each plot was estimated with regression equations [110], which use height, DBH, and species-specific allometric equations to estimate biomass for each tree [109]. In total, 124 of the BOREAS sample plots were available within our study area.

Figure 1.
The study area is indicated by the black outline. Merchantable forest inventory polygons are shaded grey.

Forest Inventory Data
The inventory system in Saskatchewan is based upon the interpretation of 1:15,000 stereo aerial photographs, with re-inventory conducted on a 15-year (approximate) cycle [111]. Stereo aerial photographs were collected, interpreted, and digitised into a forest inventory GIS product over a 3.5year period, starting in 1984 [111]. Delineation of the land base into homogenous units (hereafter referred to as forest inventory polygons) is guided by biophysical criteria that can be recognized and differentiated from the air photos (e.g., species, density, height) [112]. However, our entire study area was not re-inventoried in 1984, resulting in variable vintages for the forest inventory: approximately 8% of the study area was inventoried prior to 1984, 78% in 1984, and the remaining 14% of the study area was inventoried after 1985. The source year of the inventory was used to project dynamic inventory attributes (e.g. age, height, stocking class) to represent a constant base year, and to coincide with the date of field data collection in 1994. The study area contained 8 coniferous, 84 mixedwood, and 7 deciduous species groups. Forest inventory attributes were used in conjunction with provincially developed lookup tables to assign volumes to the merchantable portions of trees within each merchantable stand. There were 79,056 inventory polygons within the study area, of which 50,247 polygons were merchantable (Figure 1), accounting for 365,572 ha, or 51% of the total study area ( Table 1).

Remotely Sensed Data
Landsat TM data were selected for AGB estimation due to its suitability in terms of resolution and practical considerations associated with its use [40]; the spatial resolution of approximately 30 m by 30 m is adequate to assess information at the forest stand level, and this imagery has a minimum mapping area of about 0.5 ha, which is compatible with standard forestry mapping practices [86,113]. The Landsat TM image, path 37, row 22, acquired July 1994, was georectified using a first-order polynomial rectification, resulting in an RMS error of 0.80 pixels (24m). A top-of-atmosphere correction was applied to convert the image to TOA reflectance [114]. This correction accounts for differences in sensor and viewing geometry, but does not correct for variations in absolute atmospheric conditions.

Remotely Sensed Image Classification
The Landsat TM image was classified using a hyperclustering and labeling approach with an unsupervised K-means algorithm [95]. The original 241 clusters were manually merged to 16 broad land cover classes (Table 2). These classes are similar to those defined in Level 4 of the NFI classification schema [115], with the addition of some of the classes used for large area land cover mapping in Canada with AVHRR data (NBIOME), such as low and high biomass croplands [116]. A sample of aerial photography (1:15,000), collected in 1984 for the forest inventory program, was made available by the Saskatchewan government to assist in assigning the spectral clusters to the appropriate land cover classes.
Following [117], a maximum likelihood approach was used to estimate stand density classes (also following NFI defined categories; Table 3) for each forest stand type (i.e., coniferous, mixedwood, deciduous) identified from the clustering and labeling of the Landsat TM image, as described above. Training data for the density classes were generated by interpreting the aforementioned air photos for a sample of each forest cover type. With the photo-interpreted crown closures as reference data, the six optical Landsat channels and the variance of Landsat TM channel 4 within a 3 by 3 pixel-window were input into a maximum likelihood classification. The variance texture measure is included to provide forest structural information to augment the spectral data [40]. Each stand type, as identified by the satellite image, was assigned a density class (dense, open, and sparse) using the maximum likelihood approach. Thus, from the remotely sensed data, two pieces of information required for AGB estimation were generated: stand type and stand density.

Validation of the classified Landsat imagery
Typically, to assess the accuracy of a remotely sensed image classification a source of ground validation data is required. The ground validation sample should capture all classes found in the classification and cover the range of conditions that may be encountered. These types of ground validation data are not always available [118]. The use of the forest inventory data as a validation source is problematic due to the age of the inventory and the methods used to delineate polygons; these factors often result in a poor agreement between data sources [119]. Additionally, withinpolygon spectral heterogeneity [120], and the method for selecting a pixel within the polygon for comparison, are confounding issues [119]. The sample plot data collected for the BOREAS study included information on vegetation cover types; hence the field plot data were selected as the best source to validate the Landsat image classification results. Unfortunately, the field plots did not contain information for the mixedwood open and sparse classes, or for any the non-forest classes ( Table 2).

Data Compatibility: Field Data and Remotely Sensed Outputs
In order to make comparisons between the BOREAS field plots and the remotely sensed outputs, the field plots had to be assigned a cover type and a density class matching those used in the image classification procedures (Table 2 and Table 3). Species and stem density information in the field plot data were used to assign each of the field plots a corresponding cover type and density class.

Data Compatibility: Field Data and Forest Inventory
In order to make comparisons between the field data and the inventory data, including timber volumes from lookup tables provided by the Saskatchewan government, two issues needed to be resolved: (i) the total stem volumes reported in the field data included merchantable and nonmerchantable portions of all trees, whereas the volume figures provided in the forest inventory lookup tables included only the merchantable portion of trees with a DBH above the merchantable limit; and (ii), there was an age difference of up to 10 years between the field plot and forest inventory data, an amount of time in which trees can grow enough to markedly change the stand volume.
In accounting for non-merchantable and sub-merchantable timber volume in the forest inventory polygons, it was important first to understand how the total volumes reported for the field plots [108] were calculated. By following the steps outlined in the BOREAS documentation and seeking input from other forest inventory specialists where documentation was incomplete, it was possible to determine how the field volumes were calculated. As tree volumes were predicted using Kozak's variable exponent taper function [121], the key to matching the inventory merchantable volumes with those in [108] was in replicating how this formula was applied. The results in [108] were matched using the following procedure: 1. For each of the 3,124 sampled trees, height was calculated as a function of DBH and tree species, using regression equations and parameters listed in [108]. Tree heights were then adjusted using a plot bias multiplier, as described in [108]. 2. Each tree was divided into ten segments-the stump segment was 30 cm in length, and the remaining nine segments were equal in length, i.e., 9 ) 30 Kozak's formula was used to predict the inside-bark diameter at the top, middle, and bottom of each of these segments. The parameters used for Saskatchewan tree species in Kozak's formula were taken from [122]. 3. The volume of the stump segment was calculated using the volume formula for a cylinder, and the volume of the top segment was calculated using the volume formula for a cone. 4. Newton's formula for volume of a neiloid, cone, or paraboloid frustum [123] was used to calculate the volume of the remaining eight segments. 5. The volumes of all segments of each tree were totaled, and the point sampling factors were applied to individual tree volumes to determine the per-hectare volume represented by each tree. Finally, the sum of each of these per-hectare volumes was found, the result being the total live stem volume of the study plot. After completing this routine for all field plots and achieving results matching those presented in [108] (R 2 = 0.99) it was possible to move on to calculating the merchantable volume of each tree for the field data. For this study, merchantable volume was defined as the volume of wood in the stem of a tree between the top of the stump (at height = 30 cm), and the height at which the tree diameter inside bark was 8.01 cm. Finding merchantable volume involved the following steps: 1. The volume of the stump was first subtracted from the total volume of each tree. 2. Kozak's formula was applied in reverse, as described in [121], to determine the merchantable height of each tree, or the height at which the diameter inside bark was approximately 8.01 cm. The formula for volume of a cone was used to calculate the volume of wood above the merchantable height, and this amount was subtracted from the total volume. 3. Point sampling factors were applied to the remaining volume of each tree as in step 5 above, and the merchantable stand volume in cubic meters per hectare was totaled for each study plot. After calculating the merchantable volume of each study plot, the difference between each calculated merchantable volume and the corresponding merchantable volume from the forest inventory database was found, and divided by the number of years between the two data sets. The average yearly difference, or net growth increment, of all study plots was 6.875 m 3 ha -1 yr -1 and this factor was applied to the merchantable volume of the forest inventory polygons, creating a dataset which was now compatible with the field data.
The final step in creating a forest inventory dataset which was compatible with the field data was to find the total stem volume of each forest polygon. A model needed to be found which predicted total stem volume by accounting for the sub-merchantable component which was important in stands with lower merchantable volume. In field plots with higher merchantable volume, the relationship of merchantable volume to total volume was nearly one to one. In field plots with lower merchantable volume, the contribution of the smaller trees to the total volume was greater and more variable. It was observed that, in general, the total volume of small trees in each stand was inversely related to merchantable stand volume, and several models predicting the volume of small trees as a function of merchantable stand volume were explored. Finding a curvilinear function which added a small trees component to the total volume proved unsuccessful, and it was finally decided that a lookup table which found the average sub-merchantable tree volume for a given merchantable volume would be adequate.
Predicting total volume by adding together the volume of large trees and the volume of small trees, both as functions of merchantable volume, was achieved in four steps: 1. Field plots which fell in a single forest inventory polygon were grouped together, and values for sub-merchantable volume, total volume of big trees, and merchantable volume were averaged for these groups of field plots. 2. The average merchantable volumes of the field plot groups were divided into 30 m 3 /ha classes. The variance and average total volume of small trees in each of these classes was found, and a lookup table to find the average volume of small trees for a given merchantable volume was created ( Table 4). The volume of small trees for a given merchantable volume was adjusted in four instances to produce a smoother relationship between the two variables.. 3. For trees larger than 8.01 cm, linear regression analysis was used to predict the total volume of these big trees as a function of merchantable volume. The model, where VOL B is the volume of big trees, and VOL M is the merchantable volume, was found to be accurate, with an R 2 of 0.964. (2) and (3) above were totaled for each plot to find the total stem volume using:

Volumes predicted in
where VOL S is the volume of small, sub-merchantable trees from the lookup table (Table 4).
Total stem volumes were computed for each group of plots using the field data and the forest inventory data, and the results were compared. While a 1:1 relationship between the two sets would have been the ideal result, it was known that fundamental differences in the data sets would preclude this from happening (e.g., forest inventory volumes were in discrete classes, as opposed to the continuous values of the field data). Through the steps outlined above, compatible estimates of merchantable volume and total volume were possible from the field data and the forest inventory data.

Method 1: Biomass estimation from forest inventory
In the BOREAS field data, AGB and total stem volume for each stand were closely related. Figure  2 provides an overview of the process required to estimate biomass for the inventory polygons. The biomass estimation method followed was based upon the development of empirical relationships between the stand volumes assigned to the forest inventory polygons (using methods described above) and volumes from the BOREAS field plots. Of the 124 BOREAS field plots found within our study area, 95 were spatially coincident with the forest inventory polygons. In some cases, multiple field plots were found within a single polygon, or conversely, field plots straddled more than one inventory polygon. In order to build relationships between the plot-level data and the inventory data, weighted averages for field plot volume and biomass were calculated, proportional to the area of the field plot found within the inventory polygon. In total, 52 inventory polygons contained field plots, representing 26 coniferous, 11 deciduous, and 15 mixedwood inventory polygons.
There are two approaches to estimating biomass with the forest inventory data. One method determines the relationship for each structural cover type (i.e., coniferous, deciduous, or mixed) or preferably, each species, individually, and then combines these estimates to obtain an overall estimate of biomass [124]. Alternatively, the relationship between volume and biomass may be determined using a more generic model that encompasses all species or cover types [34,92]. The former method is more accurate, provided there are sufficient ground samples to develop the appropriate allometric relationships; the latter method may be more practical for large areas of mixed cover types [125]. These two approaches were explored when developing a model to predict biomass using total volume estimates from forest inventory lookup tables. Individual cover types (ICT) models were based on regression relationships between total volume estimates and biomass measurements for each of hardwood, softwood, and mixed forest cover types. An all cover types model (ACT) was based on a single regression relationship between total volume estimates and biomass measurements for all forest cover types. Linear models were used in all cases, as biomass estimates from the non-linear models that were explored may have become unreasonably high for volume estimates greater than those used to calibrate the models.
In developing the ICT model, 52 paired AGB and total stem volume measurements were divided into three groups based on field plot forest cover type (i.e., coniferous, deciduous, mixedwood). Linear regression analysis was performed on the three data sets, and the regression parameters were used to calculate biomass in tonnes per hectare (t/ha) for each forest inventory polygon in the study area, using total volumes derived from lookup table estimates as described earlier in the Methods section. Table 5 contains the regression formulas used to estimate biomass for the forest inventory polygons. After applying each of the models in Table 5 to the inventory polygons, the inventory-based biomass estimates were extracted from each polygon containing a corresponding field based estimate of biomass (n = 52). To determine the validity of the models in Table 5, regression relationships between the inventory-based AGB estimates, and field-based AGB were then found for each cover type.

Method 2: Biomass estimation from remotely sensed image outputs
The entire study area is contained within one Landsat TM scene, allowing for biomass estimates for the entire study area to be calculated from the combined cover type and density classes output from the image processing (Table 2 and Table 3). For each cover type-density combination, average AGB values for BOREAS field plots were calculated. These average AGB values were then used as AGB lookup values for the corresponding cover type-density combinations from the remotely sensed data. Since not all of the 16 cover type-density combinations listed in Table 2 were represented by the BOREAS field plots, two issues had to be addressed. First, for the forest cover type-density classes not represented by the field plots (mixedwood open and mixedwood sparse), biomass lookup values (in t/ha) were generated from those forest inventory polygons with similar forest cover type-density attributes. For example, to generate the biomass lookup value for mixedwood open, the mean biomass value was calculated from all the forest inventory polygons with a mixed species composition and a stand density between 31 and 55% (Table 3). Second, AGB lookup values were required for the nonforest land cover classes in the classified Landsat scene. Since these classes were not represented by the BOREAS field plots and corresponded (primarily) to the non-merchantable areas of the forest inventory data, appropriate biomass values were obtained through a literature review and consultation with other researchers familiar with specific cover types (Table 2). Once each cover type-density combination had a corresponding biomass value, a lookup table approach was followed to apply a biomass value to each pixel in the classified image. The lookup values for each of the cover types are provided in Table 6.

Method 3: Biomass estimation using a hybrid approach
The hybrid approach integrates the estimates generated from methods 1 and 2 above to create a complete, spatially explicit estimate of AGB for the entire study area. In this approach, spatial and attributional gaps in the forest inventory that preclude an estimation of biomass are filled by estimates generated from the remotely sensed data. In order to facilitate data integration, the non-merchantable forested inventory polygons are populated with the pixel based estimates from the imagery using a polygon decomposition method [126]. Polygon decomposition is the creation of new polygon attribute values by the summation of the pixels found within a given polygon (e.g., summing of all the pixel based biomass values for a polygon based estimate of biomass). This polygon decomposition approach facilitates the harmonization of the forest inventory estimates and the pixel based estimates from the remotely sensed imagery. For those portions of the study area where the forest inventory does not exist, the remotely sensed biomass estimates were used directly. The result is a spatially contiguous representation of biomass across the entire study area.

Results and Discussion
Applying the two different biomass estimation approaches (from forest inventory or from remotely sensed imagery) led to three possible approaches to estimate and map biomass in the study area: from the forest inventory exclusively, from remotely sensed image outputs exclusively, and from a hybrid approach where the forest inventory estimates were augmented with remotely sensed image estimates, where the forest inventory had spatial or attributional gaps.

Method 1: Biomass estimation from forest inventory
After applying each of the models in Table 5 to the inventory polygons, the inventory-based biomass estimates were extracted from each polygon containing a corresponding field based estimate of biomass (n = 52). To determine the validity of the models in Table 5, regression relationships between the inventory-based AGB estimates, and field-based AGB were then found for each cover type. The results of the validation are summarized in Table 7. The ICT models were shown to be unpredictable, with R 2 values ranging from -0.174 to 0.754, and were based on too few instances of each cover type. In Figure 3, we present the regression relationship between the predicted and field measured biomass values illustrating a moderate positive relationship between the two estimates, with an R 2 of 0.64.  The ACT model was applied to all forest inventory polygons in the database, except those classified as non-merchantable. The AGB estimates derived from the forest inventory ranged, by species, from 1 to 209 t/ha. The remaining non-merchantable areas included waterbodies, disturbed land (including burn-overs and cut blocks), and land not inventoried, such as parks or private lands. Total AGB, representing only 51% of the study area, was estimated at approximately 40 Mt (Table 8).

Validation of the classified Landsat imagery
For the forested classes represented in both the field data and the image classification, overall agreement between the assigned cover types was 80.1%. The overall level of agreement between the plot determinations of density and those derived from the Landsat image classification was 73.7%. The BOREAS field data provides an independent data source for validation, as none of the field plots were used for the unsupervised classification of land cover type or in the creation of the supervised maximum likelihood approach for deriving the density information.

Method 2: Biomass estimation from remotely sensed imagery
The total biomass estimated from the satellite image approach (covering the entire study area) was approximately 58 Mt (Table 9). A regression of the remotely sensed estimates of AGB against the field measured biomass values resulted in an R 2 of 0.54 ( Figure 4). Table 10 summarizes the results of the hybrid or integrated approach to generating an estimate of AGB for the entire study area. Biomass estimates for the non-merchantable classes from method 2 were decomposed to populate the non-merchantable polygons in the forest inventory. Similarly, those portions of the study area where there was no forest inventory data were infilled with the remotely sensed biomass estimates ( Figure 5). Total biomass for the merchantable polygons is the same as for the inventory-exclusive approach (Table 8); total biomass for non-merchantable polygons and noninventoried areas are generated from the remotely sensed data approach (Table 9). Total AGB for the study area was estimated at approximately 62 Mt. The AGB estimated for the merchantable polygons with Method 2 was 5% lower than the AGB estimated from Method 1. This underestimation of forest biomass by remote sensing methods is a phenomena originally identified by [127]. Other studies have found a saturation effect, particularly in complex forest stands in tropical forests, whereby canopy reflectance becomes saturated when AGB approaches a certain level, and this effect, combined with the impact of canopy shadows, limits the ability of the data for detection of additional biomass beyond that level [40]. Given the capability for and the diligence with which the field and inventory data could be harmonized for this study, and the subsequent volume-biomass associations developed, it is not surprising that the there is a stronger positive relationship between the field and inventory estimates of AGB, when compared to the relationship between the field and remotely sensed estimates of AGB.

Method 3: Biomass estimation using a hybrid approach
The robustness of the inventory estimates of AGB are a function of the photo interpretation, the volume tables generated, and the volume to biomass conversion. Similarly, the robustness of the remote sensing estimates must be considered as a function of the land cover classification, the density estimation approach, and the AGB lookup values applied. As common to most studies using GPS under a forest canopy, both approaches will be impacted by the positional accuracy of the field plot locations. Potential users of the AGB estimates must consider the factors influencing the accuracy of the approach used when applying the results.    For this study, the estimates generated from the forest inventory, where available, are assumed to be more robust than those generated from the remotely sensed outputs. As a result, we treat the inventory estimates of AGB as the best available estimates, and then augment these estimates where there are spatial or attributional gaps in the forest inventory. By integrating multiple data sources, we are able to generate a complete picture of AGB for the study area. If we had relied only on the estimate generated from the forest inventory, we would have underestimated total AGB for the study area by 20% because the inventory only provided estimates for 51% of the study area. By integrating the AGB estimates from the two approaches, we are able to generate a more complete estimate representing 100% of the study area. Equally important is the ability to generate a map of AGB in the study area, thereby not only providing a total estimate of biomass, but also indicating the spatial distribution of AGB across our area of interest ( Figure 5). In turn, this spatially explicit estimate can be an important information source for a wide range of applications.

Conclusions
Three different approaches were used to generate an estimate of AGB for our study area. Using conversion factors applied to a non-spatially exhaustive forest inventory dataset produced an AGB estimate with a RMSE = 16.95 t/ha. With this approach, total AGB for the study area was estimated to be 40 Mt. Applying a lookup table to unique combinations of land cover and vegetation density outputs derived from remotely sensed data resulted in biomass estimates with a RMSE = 19.97 t/ha. This approach provided complete coverage for the study area with total AGB estimated at 58 Mt. The hybrid approach combined the increased accuracy of the inventory approach with the more spatially comprehensive remote sensing approach and produced a complete AGB map for the study area, with total AGB estimated at 62 Mt.
The forest inventory approach is the most well understood and commonly applied approach for large area biomass estimation. The remote sensing approach has the advantage in representing the natural variability of forest stands (and is a continuous dataset). The hybrid approach capitalizes upon the strengths of both the forest inventory and remotely sensed data sources, while the process of polygon decomposition facilitates the integration of the remotely sensed estimates into the nonmerchantable forest inventory polygons. The close agreement of the results of the forest inventory and remote sensing approaches is encouraging -where there is spatial concordance between the two approaches, total AGB estimates varied by only 2%.
Biomass maps derived from forest inventory data exclusively are well established and understood by forest managers. These maps are highly correlated to stand volume, which is a well-documented attribute in forestry. The addition of remotely sensed imagery illustrates the importance of nonmerchantable and non-inventoried vegetation for estimating biomass over large areas. The inclusion of the biomass conversion factors, applied to satellite images, for non-merchantable or non-inventoried polygons, adds value to the forest management database. In addition, models that require spatially explicit biomass estimates over extensive spatial coverage may require the comprehensive coverage offered by satellite imagery. This study has demonstrated a biomass mapping solution whereby data integration is shown as a viable approach for generating spatially explicit estimates of AGB over large areas. vector-based forest inventory data. The availability of Saskatchewan BOREAS field data provided by Mike Apps of the Canadian Forest Service (Northern Forestry Centre) was greatly appreciated. Isaac Ferbey, Paul Boudewyn, David Seemann, and Danny Grills of the Canadian Forest Service (Pacific Forestry Centre) are thanked for assistance with analysis and data organization. David Kilshaw, of the BC Ministry of Forests, is thanked for assistance and advice related to photo-interpretation.