Using Satellite Imagery to Evaluate Bark Beetle-Caused Tree Mortality Reported in Aerial Surveys in a Mixed Conifer Forest in Northern Idaho, USA

: Bark beetles cause signiﬁcant tree mortality in western North America. The United States Forest Service coordinates annual insect and disease surveys (IDS) by observers in airplanes to map and quantify the tree mortality caused by beetles. The subjective nature of these surveys means that accuracy evaluation is important for characterizing uncertainty. Furthermore, the metric reported for quantifying tree mortality recently changed (2012–2018 depending in region) from killed trees per acre to percent tree mortality within damage polygons, posing challenges for linking older and newer records. Here we evaluated IDS severity estimates in a beetle-a ﬀ ected forest in northern Idaho, USA using ﬁne-resolution satellite imagery, which permitted greater areal coverage than ﬁeld data. We ﬁrst used well-established methods to map beetle-caused tree mortality in two WorldView-2 (WV2) images with a high accuracy relative to ﬁeld observations. Trees-per-acre measurements within collocated IDS polygons were then converted to percent mortality using three methods and evaluated with the WV2 maps. The overall accuracies for the three methods ranged from 35–38% (for methods that used ﬁve percent-mortality classes) and 49–56% (three classes). When IDS and WV2 estimates of mortality severity that were within ± 15% of each other were considered accurate, overall accuracies were 71–78%. Within the aerial survey damage polygons, the total mortality area tended to be overestimated relative to WV2. WV2 imagery identiﬁed ~50% more mortality across the study region compared with the IDS methods, with most of the di ﬀ erence occurring where damage was low severity or in wilderness areas. Severity of Douglas-ﬁr beetle-caused tree mortality was estimated the most accurately, whereas severity of mountain pine beetle-caused tree mortality was estimated the least accurately. Future studies that control for temporal ambiguity between IDS and satellite imagery, as well as IDS spatial error, might provide better assessments of IDS severity accuracy. Our study increases the usefulness of the rich aerial survey database by providing estimates of uncertainty in the IDS database of tree mortality severity.


Introduction
Forest insects and diseases affect millions of hectares across Canada and the United States each year [1][2][3]. Insect and disease surveys (IDS) provide valuable information about the extent and severity of insect-caused tree mortality and damage, and comprise the only national scale, annual data set documenting mortality, damage agents, and host tree species impacted. IDS surveys are conducted annually by the United States Forest Service (USFS) across most forested public land in the conterminous United States. Trained surveyors collect observations of recent forest mortality by drawing polygons on maps and recording damage attribute information that includes beetle and host species and a measure of damage severity (tree mortality in the case of bark beetles) from fixed-wing aircraft and helicopters. Annual trends in tree mortality and damage inferred from IDS data are especially valuable to forest managers and researchers for monitoring and anticipating future tree mortality [4]. Studying long-term trends in relation to other factors can reveal causes and patterns of mortality that are not apparent with annual datasets [5][6][7].
Recently, the USFS Forest Health Assessment and Applied Sciences Team (FHAAST) updated the methodology and tools to assist aerial survey observers, with the goal of standardizing IDS data and making the IDS data more compatible with remote sensing observations. Surveys were collected under the Digital Aerial Sketch Mapping (DASM) system [8] but are now collected under the Digital Mobile Sketch Mapping (DMSM) system [9]. As a part of this new system, damage severity, formerly reported as killed trees per acre, is now surveyed in five classes of percent mortality within the area damaged (1-3%, 4-10%, 11-29%, 30-50%, and >50%). In the western US, the conversion occurred during 2012-2018, depending on region.
The analysis of longer-term trends of bark-beetle-caused tree mortality requires consistency in the disturbance metrics reported across years. Thus, methodologies are needed to convert legacy trees-per-acre measurements to percent mortality measurements. Three methods currently exist: (1) the remote-sensing-based method of Meddens et al. [2], hereafter referred to as Meddens, (2) the histogram-matching method of FHAAST [10], hereafter referred to as FHAAST, and (3) the categorization of trees-per-acre measurements into broad classes (1-10%, 11-30%, >30% mortality area) via the method of Egan et al. [11], hereafter referred to as FHP.
The aerial surveys are conducted by observers, and thus the subjective information recorded may be associated with higher uncertainty because of variability in flying and viewing conditions, timing of flights relative to host responses, and surveyor methods and experience, such as how surveyors choose to represent tree mortality, e.g., polygon size is decided by the surveyor. Evaluations of aerial survey-based estimates of mortality with remotely sensed imagery have been conducted for a few locations and times [2,12,13]. However, additional assessments are needed in different situations to help characterize the accuracy of aerial surveys, including evaluations of both legacy and current survey methods.
Here we used a satellite-derived map of tree mortality to evaluate the accuracy of IDS estimates of percent tree mortality. We first classified WorldView-2 (WV2) remote sensing imagery in an area affected by multiple bark beetle species in Idaho, USA, and evaluated this mortality map with field observations. We then applied the three existing methods listed above to convert legacy trees-per-acre values to newer percent mortality values. Finally, we evaluated the different estimates of mortality from aerial surveys using the WV2-based mortality map as our reference standard.

Study Area
An area of recent, elevated bark beetle activity with various bark beetle and host tree species in the Nez Perce National Forest near Elk City, Idaho, USA was chosen as a study area (Figure 1). Tree species within the mixed conifer forest included grand fir (Abies grandis (Douglas ex D. Don) Lindl.), Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco), Engelmann spruce (Picea engelmannii Parry ex Engelm.), lodgepole pine (Pinus contorta Douglas ex Loudon), ponderosa pine (Pinus ponderosa Lawson & C. Lawson), subalpine fir (Abies lasiocarpa (Hook.) Nutt.), and western larch (Larix occidentalis Nutt. (Pinaceae)). Dominant mortality agents included the fir engraver beetle (Scolytus ventralis LeConte) in grand fir, the Douglas-fir beetle (Dendroctonus pseudotsugae Hopkins) in Douglas-fir, the spruce beetle (Dendroctonus rufipennis Kirby) in Engelmann spruce, and the mountain pine beetle (Dendroctonus ponderosae Hopkins) in lodgepole pine. Tree mortality caused by all of these mortality agents had been ongoing since 2014, the beginning of our study period. Spruce and Douglas-fir beetle-caused tree mortality became more elevated in 2016 and 2017, respectively. occidentalis Nutt. (Pinaceae)). Dominant mortality agents included the fir engraver beetle (Scolytus ventralis LeConte) in grand fir, the Douglas-fir beetle (Dendroctonus pseudotsugae Hopkins) in Douglas-fir, the spruce beetle (Dendroctonus rufipennis Kirby) in Engelmann spruce, and the mountain pine beetle (Dendroctonus ponderosae Hopkins) in lodgepole pine. Tree mortality caused by all of these mortality agents had been ongoing since 2014, the beginning of our study period. Spruce and Douglas-fir beetle-caused tree mortality became more elevated in 2016 and 2017, respectively.

Field Observations
We established plots to measure relevant forest and mortality characteristics in our study area in 2018. To sample a variety of tree species killed by various mortality agents at various mortality severities with these plots, we stratified the study area by: (1) the four dominant IDS mortality agents within the study area (fir engraver beetle, Douglas-fir beetle, spruce beetle, and mountain pine beetle); (2) five levels of percent mortality (very light (1%-3%), light (4%-10%), moderate (11%-29%), severe (30%-50%), and very severe (>50%)) as indicated by 2017-2018 IDS data; and (3) five levels of percent mortality (the same as those of IDS) as indicted by a classified 2017 WV2 image of tree mortality (classified and accuracy assessed by visual inspection, aggregated to 30-m resolution). This stratification resulted in 100 possible strata within 2017 and 2018 IDS polygons (i.e., four mortality agents × five IDS severity classes × five WV2 severity classes = 100 possible strata). However, many of these combinations did not exist, leading to 42 actual strata. We also placed plots outside 2017 and 2018 IDS polygons, where 20 strata were possible (four mortality agents × five WV2 severity classes = 20 strata), however, only 14 of these strata existed. This led to a total of 56 existing strata. Random plot locations were placed within each existing stratum.

Field Observations
We established plots to measure relevant forest and mortality characteristics in our study area in 2018. To sample a variety of tree species killed by various mortality agents at various mortality severities with these plots, we stratified the study area by: (1) the four dominant IDS mortality agents within the study area (fir engraver beetle, Douglas-fir beetle, spruce beetle, and mountain pine beetle); (2) five levels of percent mortality (very light (1-3%), light (4-10%), moderate (11-29%), severe (30-50%), and very severe (>50%)) as indicated by 2017-2018 IDS data; and (3) five levels of percent mortality (the same as those of IDS) as indicted by a classified 2017 WV2 image of tree mortality (classified and accuracy assessed by visual inspection, aggregated to 30-m resolution). This stratification resulted in 100 possible strata within 2017 and 2018 IDS polygons (i.e., four mortality agents × five IDS severity classes × five WV2 severity classes = 100 possible strata). However, many of these combinations did not exist, leading to 42 actual strata. We also placed plots outside 2017 and 2018 IDS polygons, where 20 strata were possible (four mortality agents × five WV2 severity classes = 20 strata), however, only 14 of these strata existed. This led to a total of 56 existing strata. Random plot locations were placed within each existing stratum.
The 56 stratified plot locations were visited in August and September 2018. At these 56 locations we measured all standing trees >10 cm diameter at breast height (DBH, 1.37 m) on 400-m 2 (1/10-acre) fixed-radius plots. DBH, tree species, health status (live, dead, or unhealthy based on needle condition), mortality agent (identified by hosts and bark beetle gallery patterns), estimated years since death (based on needle/branch condition, see Table A1), crown class (dominant, codominant, intermediate, suppressed), and distance and bearing from the plot center were measured for each tree. Plot center locations were measured with a Trimble Geo 7x GNSS receiver and were subsequently differentially corrected through post processing. Estimated horizontal accuracy of plot center locations after differential correction averaged 0.52 m, with a standard deviation of 0.24 m. Individual tree coordinates were calculated from plot center locations using distance and bearing field measurements. Tree locations from these field observations were used to develop and evaluate maps of tree mortality from satellite imagery (described in Section 2.3.). We originally planned to use field observations to directly assess IDS accuracy, in addition to using satellite imagery, but found that our plots under-sampled, and therefore poorly represented, IDS polygons, which were very large (median area of 4 ha) relative to our field plots (0.04 ha). Hence, we limited our accuracy assessment to a comparison between satellite-derived maps and IDS polygons.

WorldView-2 Processing and Classification
We chose fine-resolution (2 m), WorldView-2 (WV2) satellite imagery for evaluating IDS data. Well-established, highly accurate methods of classifying tree mortality with such imagery exist (e.g., [14][15][16]). Using satellite imagery allowed us to evaluate IDS data over a much greater area than possible with field observations. WV2 collects eight-band multispectral imagery; band names and wavelengths are coastal (400-450 nm), blue (450-510 nm), green (510-580 nm), yellow (585-625 nm), red (630-690 nm), red edge (705-745 nm), near-infrared 1 (770-895 nm), and near-infrared 2 (860-1040 nm). Two WV2 scenes were acquired over the study area on 28 June and 1 September 2018 ( Figure 1). We used standard (2A) products, which are radiometrically and sensor corrected and normalized for topographic relief with a coarse digital elevation model (DEM). A comparison of the imagery with a fine-resolution orthoimage showed the WV2 imagery to be well-corrected for terrain distortions. Imagery was converted from digital numbers (DN) to top-of-atmosphere radiance using the equation: where L is top-of-atmosphere radiance in units of W µm −1 m −2 sr −1 , GAIN and OFFSET are band-dependent calibration factors provided by DigitalGlobe, abscalfactor is the absolute calculation factor, and effectivebandwith is the effective band width; abscalfactor and effectivebandwith are given in metadata delivered with the imagery [17]. Values of L were then converted to top-of-atmosphere (TOA) reflectances using the equation: where TOA is top-of-atmosphere reflectance, L is radiance calculated in Equation (1), dist is the Earth-Sun distance in astronomical units for the date of image acquisition, Esun is the top-of-atmosphere solar radiation in the band of interest, and Θ is the solar zenith angle for the date and time of image acquisition, defined as 90 • minus the mean sun elevation (meanSunEl) from the metadata [17]. TOA reflectances were converted to surface reflectances using empirical line calibration (ELC) [18,19]. ELC was implemented by finding several bright (rooftops) and dark (water bodies) surfaces within each WV2 scene to represent 95 percent and 0 percent reflectance, respectively. A simple linear model predicting surface reflectance from TOA reflectance was developed for and applied to each TOA band to produce surface reflectance images. Normalized difference vegetation index (NDVI) and red green index (RGI) images were created for classification. NDVI and RGI are defined as: where NIR is the near-infrared band (WV2 band 7, 770-895 nm), RED is the red band (WV2 band 5, 630-690 nm), and GREEN is the green band (WV2 band 3, 510-580 nm). Image processing was done in R [20]. The imagery was classified into one of five classes: live trees, dead trees (red and gray), green herbaceous vegetation, brown herbaceous vegetation, and bare ground. Individual fieldmeasured trees, live and dead, were related to individual WV2 pixels; these pixels were selected for image classification. Pixels of green herbaceous, brown herbaceous, and bare ground cover for classification were identified through visual inspection of imagery, as was done by Meddens et al. [15] and Bright et al. [16]. For each of these five classes, 50% of the identified pixels were reserved for classification training and 50% were reserved for classification evaluation. Maximum likelihood classification of imagery into the five previously mentioned classes was implemented in ERDAS IMAGINE software, using green, red edge, near-infrared 2, NDVI, and RGI indices/bands as classification variables. Shadows and water bodies were masked using the red band (red < 4.1% reflectance). A dead tree grid, where dead tree pixels were set to one and pixels in other classes were set to zero, was created from the classified imagery for further analysis. Our high classification accuracy (see Results) allowed us to use these satellite-based classifications as reference data for evaluating the IDS polygon data.

IDS Polygon Processing
We downloaded IDS polygon data [21] and clipped regional geodatabases to our study area extent. The mean and median IDS polygon sizes were 21 and 4 ha, respectively, and ranged from 0.8 to 4140 ha. Chronic bark beetle activity occurred in the study area for several years prior to 2018, so overlapping IDS polygons from multiple years were common. For comparing IDS to the WV2 classifications within our study area, we combined all IDS polygons from 2014-2018. We compared five years of previous IDS polygons to the WV2 classifications because pixels identified in the satellite imagery as dead could have been killed in one of the prior 1-5 years. Trees killed >5 years previous had likely lost most branches and needles [14,22], may have begun falling [23][24][25], and therefore would not be detected by WV2.
IDS collection methods varied between the years 2014-2018. For the years 2014-2015, severity was recorded as trees per acre using the DASM system. During 2016-2018, the mortality severity for each IDS polygon was recorded by surveyors as a percent mortality class (i.e., the DMSM system). We estimated percent mortality for these polygons to be the mid-point of each classification class, as specified in the DMSM system (i.e., very light, 2%; light, 7%; moderate, 20%; severe, 40%; and very severe, 75%). To convert trees-per-acre estimates for 2014-2015 IDS data to percent-mortality estimates, we used three different methods:

Meddens
Meddens et al. [2] developed a 1-km gridded dataset of crown area mortality from IDS polygon data across western North America for the years 2001-2010. To convert trees per acre to crown mortality area, Meddens et al. [2] multiplied the number of killed trees estimated by IDS by the mean crown area of the tree species. Adjustment factors that compensated for IDS trees-per-acre underestimation bias were developed by comparing estimates of the number of killed trees from IDS data with estimates from satellite imagery. To implement the method of Meddens et al. [2], we calculated a continuous measure of percent mortality for each IDS polygon using the formula: where PM is percent mortality, TPA is the trees-per-acre measurement of the IDS polygon, and CROWN is the average tree crown area (species specific) of the recorded IDS host ( [2], Appendix A) in acres. The 13.6 multiplier was used to compensate for IDS underestimation bias [2,26]. Then, for each year, IDS polygons were converted to 2-m percent-mortality grids. Annual percentmortality grids were summed to one cumulative percent-mortality grid. The resulting percent-mortality grids were converted back to a vector polygon layer of cumulative mortality, in which adjacent areas of identical mortality area represented polygons, no overlapping polygons existed, and individual polygons could represent one or several years of IDS mortality. We converted cumulative percent-mortality grids back to polygons to facilitate analysis of individual IDS cumulative polygons, our units of analysis.

FHAAST
The histogram-matching method developed by FHAAST seeks to convert legacy trees-per-acre measurements into percent-mortality measurements through histogram analysis [10]. First, the proportion of the total area within each of the five percent-mortality classes was calculated for mortality area that IDS surveyors documented for USFS Region 1 for the years 2016-2018. Next, 2014-2015 IDS polygons recorded using the DASM system were sorted by trees per acre, and trees-per-acre breakpoints were identified that divided IDS polygons into five classes having the total area proportions of the 2016-2018 percent-mortality classes (Table 1). Our implementation of the FHAAST method was similar to what is described in the Meddens method, except that 2014-2015 IDS polygons were converted to 2-m annual trees-per-acre grids; these annual trees-per-acre grids were summed to create a cumulative trees-per-acre grid, which was then converted to a five-class percent-mortality grid for the years 2014-2015 using the trees-per-acre breakpoints and percent-mortality class midpoints as determined by the FHAAST method for USFS Region 1, 2014-2018 (Table 1). This percent-mortality grid was then added to 2-m percent-mortality grids for the years 2016-2018 and converted back to a vector polygon layer of cumulative mortality. Note that we first summed 2014-2015 trees-per-acre measurements to cumulative trees-per-acre measurements before converting to percent-mortality measurements, rather than converting annual trees-per-acre estimates, an important nuance that affected our cumulative percent-mortality measurements.

FHP
The FHP method is based on the relationship between trees per acre and percent mortality as measured on 329 fixed-area ground plots (30,386 trees) containing Dendroctonus-killed yellow pine across the United States [11]. Trees-per-acre measurements are converted into three percent-mortality classes of 1-10%, 10-30%, and >30%. Motivations for using three percent-mortality classes included: (1) acknowledgment that aerial surveyors can estimate three levels of mortality severity with high accuracy, but estimate five classes or continuous measures with less accuracy; (2) resource managers indicated that these three severity classes were useful for indicating when mortality conflicted with management objectives; and (3) examination of IDS and field data showed that variability increased in the relationship between trees-per-acre and observed tree mortality as severity increased, with natural breakpoints at 10% and 30% mortality.
To determine TPA thresholds that corresponded with 10% and 30% mortality, three regression models predicting natural log-transformed trees per acre from natural log-transformed percent mortality were developed; one model for the entire dataset, one for a data window near 10% mortality, and one for a data window near 30% mortality. Trees per acre was predicted at 10% and 30% mortality using these three models, and lower 95% confidence interval values were averaged and rounded to the nearest base-10 to determine the optimal trees-per-acre values associated with 10% and 30% mortality, which were coincidentally 10 and 30 trees per acre, respectively. Classifying the 329 plots using these trees-per-acre thresholds resulted in an overall classification accuracy of 84%.
We implemented the FHP method identically to the FHAAST method, except that the mid points of the trees-per-acre threshold values determined by Egan et al. [11] were used to convert the cumulative 2014-2015 trees-per-acre grid to a three-class cumulative percent-mortality grid ( Table 2). This percent-mortality grid was then added to 2-m percent-mortality grids for the years 2016-2018 and converted back to a vector polygon layer of cumulative mortality.

IDS and WorldView-2 Comparisons
To compare IDS to the WV2 estimates of tree mortality, cumulative 2014-2018 percent-mortality measurements of IDS polygons as determined by each of the three methods were classified into three-(1-10%, 11-30%, and >30%) and five-class (1-3%, 4-10%, 11-29%, 30-50%, and >50%) percent-mortality estimates. We calculated percent mortality of the WV2 dead tree grid within each IDS cumulative mortality polygon, and likewise converted WV2 continuous measures of percent mortality to threeand five-class percent-mortality measurements. IDS and WV2 classified mortality estimates were then assessed via confusion matrices for each of the three methods [27], where the unit of analysis was a cumulative IDS polygon.
We also compared IDS to the WV2 estimates of tree mortality by differencing continuous WV2 severity estimates from cumulative IDS severity estimates for each of the three IDS methods. We considered polygons where differences were ±15% to be in agreement. We also assessed IDS-WV2 severity agreement by mortality agent and by visual comparison within a geographic information system (GIS). Recent burns, clouds, and cloud shadows were masked from analysis.
We calculated total mortality area across both WV2 scenes as determined by WV2 (within IDS polygon extents and across the entire WV2 extent) and as estimated by each IDS method. WV2 dead tree grids were aggregated to 90-m grid cells (0.81 ha in area), approximately the minimum mapping unit area of an IDS polygon, to derive percent-mortality estimates of WV2 across scenes.

Field Observations
We measured 1361 trees on 56 plots. The tree mortality on plots averaged 29% and ranged from 0-85%. The estimated years since death of beetle-killed trees measured on plots showed that mortality had been ongoing for several years, consistent with IDS data (Figure 2). Mortality agents identified by hosts and bark beetle gallery patterns in the field included the Douglas-fir beetle, fir engraver, mountain pine beetle, and spruce beetle, as well as two mortality agents not identified in sampled IDS polygons: the western pine beetle (20 trees) and the western balsam bark beetle (4 trees). The mortality agent could not be identified for 40.6% of trees, especially for older dead trees.

WorldView-2 Classification Accuracy
Using our field observations, we found that the overall lifeform classification accuracies of the June 28 and September 1 WV2 imagery were 95% and 96%, respectively (Tables 3 and 4, Figure 3). The brown herbaceous class was not used in the 28 June classification because vegetation had not senesced yet. Dead trees were only infrequently confused with live trees and bare ground. The WV2 classification detected all types of disturbance, including fire, and low-level mortality across the study area. Clouds obstructed much of the June 28 imagery, making processing and analysis of the imagery challenging. Shaded cells indicate correctly classified pixels. Shaded cells indicate correctly classified pixels.

WorldView-2 Classification Accuracy
Using our field observations, we found that the overall lifeform classification accuracies of the June 28 and September 1 WV2 imagery were 95% and 96%, respectively (Tables 3 and 4, Figure 3). The brown herbaceous class was not used in the 28 June classification because vegetation had not senesced yet. Dead trees were only infrequently confused with live trees and bare ground. The WV2 classification detected all types of disturbance, including fire, and low-level mortality across the study area. Clouds obstructed much of the June 28 imagery, making processing and analysis of the imagery challenging.

Comparison of IDS and WorldView-2 Severity Estimates
Using the WV2 imagery as reference data, we found low overall accuracies of mortality severity from the different IDS methods of 35%-56%. The FHAAST method outperformed the Meddens method in the five-class comparison, with overall accuracies of 37.8 and 34.5%, respectively (Table 5; see Appendix A for full confusion matrices). The 1%-3% and 4%-10% classes were frequently confused in both methods (Tables A2 and A3). Although >50% cumulative mortality area occurred frequently in cumulative IDS polygons, actual percent mortality (according to WV2 imagery) rarely reached >50% mortality (Tables A2 and A3) within IDS polygons.
The FHAAST and FHP methods performed the best among the IDS methods in the three-class comparison with overall accuracies of 55.5% and 55.4%, respectively. The Meddens method achieved an overall accuracy of 49.0% (Tables 5, A4-A6). Table 5. Overall accuracies (%) for each method converting 2014-2018 insect and disease survey (IDS) severity estimates from legacy trees-per-acre measurements to percent mortality.

Comparison of IDS and WorldView-2 Severity Estimates
Using the WV2 imagery as reference data, we found low overall accuracies of mortality severity from the different IDS methods of 35-56%. The FHAAST method outperformed the Meddens method in the five-class comparison, with overall accuracies of 37.8 and 34.5%, respectively (Table 5; see Appendix A for full confusion matrices). The 1-3% and 4-10% classes were frequently confused in both methods (Tables A2 and A3). Although >50% cumulative mortality area occurred frequently in cumulative IDS polygons, actual percent mortality (according to WV2 imagery) rarely reached >50% mortality (Tables A2 and A3) within IDS polygons.
The FHAAST and FHP methods performed the best among the IDS methods in the three-class comparison with overall accuracies of 55.5% and 55.4%, respectively. The Meddens method achieved an overall accuracy of 49.0% (Tables 5 and A4-A6). Table 5. Overall accuracies (%) for each method converting 2014-2018 insect and disease survey (IDS) severity estimates from legacy trees-per-acre measurements to percent mortality.
Polygons of Douglas-fir beetle-caused mortality were estimated most accurately (73-94% accurate); the FHP and FHAAST methods estimated Douglas-fir beetle-caused mortality more accurately (93-94%) than the Meddens method (73%). Polygons of mountain pine beetle-caused mortality, the majority, were estimated least accurately (66-70% accurate) and tended to be overestimated by 15-50% (20-24%; Tables 6-8). Severity estimates of fir engraver, spruce beetle, and other mortality agents were intermediate in accuracy between Douglas-fir beetles and mountain pine beetles.    (14) 32 (2) Within IDS polygon extents, the IDS estimates of total cumulative mortality area were 25-32% greater than that of WV2 (Figure 4). The FHP method estimate of total mortality was closest to that of WV2 and estimated 25% more mortality area than WV2. Total mortality areas estimated for low and moderate severity (<30%) were close between WV2 and each of the three methods, whereas the IDS methods estimated more total area of severe mortality within IDS polygon extents. and moderate severity (<30%) were close between WV2 and each of the three methods, whereas the IDS methods estimated more total area of severe mortality within IDS polygon extents. When the entire region was considered, i.e., both areas within and outside IDS polygons in both the June and September WV2 image extents, the IDS estimates of total mortality area were 48%-51% of those of the WV2 estimate (total mortality area of 14,251 ha). This occurred because WV2 detected a large amount of mortality, especially low-severity mortality (1%-10%), outside of IDS polygons ( Figures 5 and 6), and because surveyors did not either detect or record a large amount of severe mortality in the Gospel Hump Wilderness Area in the southern portion of the 28 June image extent, despite most of this area being reported as being surveyed annually since 2014 ( Figure 5). When the entire region was considered, i.e., both areas within and outside IDS polygons in both the June and September WV2 image extents, the IDS estimates of total mortality area were 48-51% of those of the WV2 estimate (total mortality area of 14,251 ha). This occurred because WV2 detected a large amount of mortality, especially low-severity mortality (1-10%), outside of IDS polygons ( Figures 5 and 6), and because surveyors did not either detect or record a large amount of severe mortality in the Gospel Hump Wilderness Area in the southern portion of the 28 June image extent, despite most of this area being reported as being surveyed annually since 2014 ( Figure 5). Overall spatial locations of mortality estimated by IDS and WV2 agreed, especially in areas of the highest insect activity (Figures 5 and 6). Disagreement of mortality severity tended to be greatest in areas of repeat mortality, where IDS polygons overlapped more frequently (Figures 5 and 6). Overall spatial locations of mortality estimated by IDS and WV2 agreed, especially in areas of the highest insect activity (Figures 5 and 6). Disagreement of mortality severity tended to be greatest in areas of repeat mortality, where IDS polygons overlapped more frequently (Figures 5 and 6).

Discussion
We found moderate agreement between tree mortality severity estimated by the IDS and tree mortality as mapped by fine-resolution satellite imagery. Overall accuracies of severity measurements that we found (35%-56% when assessed by confusion matrices and 71%-78% when assessed by differencing) are similar to those found by Coleman et al. [13], who also compared the IDS data to ground observations and fine-resolution satellite imagery. In a comparison of 68 polygons that were <2 ha in size, Coleman et al. [13] found that the annual IDS mortality severity estimates (trees per acre) were correct 22% of the time, overestimated 41% of the time, and underestimated 36% of the time. Backsen and Howell [12] found TPA distributions of IDS and photo interpretation of aerial imagery to be comparable, with mean values of 10.2 and 13.8, respectively. Like Coleman et al. [13], we found that IDS tended to overestimate mortality severity within polygon extents. Unlike

Discussion
We found moderate agreement between tree mortality severity estimated by the IDS and tree mortality as mapped by fine-resolution satellite imagery. Overall accuracies of severity measurements that we found (35-56% when assessed by confusion matrices and 71-78% when assessed by differencing) are similar to those found by Coleman et al. [13], who also compared the IDS data to ground observations and fine-resolution satellite imagery. In a comparison of 68 polygons that were <2 ha in size, Coleman et al. [13] found that the annual IDS mortality severity estimates (trees per acre) were correct 22% of the time, overestimated 41% of the time, and underestimated 36% of the time. Backsen and Howell [12] found TPA distributions of IDS and photo interpretation of aerial imagery to be comparable, with mean values of 10.2 and 13.8, respectively. Like Coleman et al. [13], we found that IDS tended to overestimate mortality severity within polygon extents. Unlike Coleman et al. [13], we assessed cumulative estimates of mortality severity, in which annual IDS severity estimation errors were compounded and surveyors perhaps may not have distinguished new mortality from previous mortality. Additionally, we analyzed polygons that were >2 ha in size, which have a greater likelihood of containing unaffected forest and thereby exaggerating mortality estimates [28].
We compared three methodologies for converting legacy trees-per-acre measurements of severity to percent mortality. The FHP and FHAAST methods performed the best. However, our comparison of these three methods was limited to the years 2014 and 2015 when transformations from trees per acre to percent mortality were required. Additional analyses that compare these three methods across the entire IDS data archive are needed for a more definitive assessment of which method converts legacy trees-per-acre estimates to percent mortality most reasonably. Both Meddens et al. [2] and Backsen and Howell [12] found that IDS trees-per-acre values underestimated field observations of killed trees per acre, hence the inclusion of the 13.6 multiplier in the Meddens method (Equation (5)); the FHP and FHAAST methodologies convert trees-per-acres values with this known underestimation bias to realistic percent mortality classes. The Meddens method has the advantage of providing continuous estimates of mortality severity derived from legacy trees-per-acre measurements. However, based on the generally poor to moderate accuracies achieved when comparing IDS and WV2 severity measures, continuous measures of mortality severity derived from legacy trees-per-acre measurements using the Meddens method might convey a false sense of precision. The FHP method acknowledges this limitation of IDS severity accuracy, hence its use of three classes. Using three classes to quantify mortality severity rather than five classes yielded higher overall accuracies, as expected statistically since three classes require less specificity to correctly classify relative to five. Conversion factors of the FHAAST method can vary with the reference period used for histogram matching. In preliminary analyses, we used conversion factors developed across a broader reference period, which resulted in poorer performance by the FHAAST method in our study area.
The IDS polygons representing Douglas-fir beetle-caused mortality tended to be the most accurate in terms of severity, whereas IDS polygons representing mountain pine beetle-caused mortality tended to be the least accurate. Possible reasons for this include: (1) much of the Douglas-fir mortality being fairly recent (2017), so that killed tree crowns had not deteriorated since IDS surveying and were therefore well detected by WV2; (2) the large, average size of Douglas-fir crowns relative to those of lodgepole pine in a mixed conifer stand made severity estimation by surveyors more accurate; and (3) Douglas-fir beetle polygons averaged the smallest in area relative to polygons of other tree hosts, so that estimates by surveyors were more likely to be correct.
The WV2 classification identified substantial mortality outside the IDS polygons (i.e., missed by the aerial surveyors). Cumulative mortality detected by the satellite imagery across the two WV2 scenes was~50% more than the mortality area estimated from the aerial surveys. However, much of that missed damage was in the low-severity class (<10% mortality), which may be difficult to identify by surveyors. Substantial mortality within the Gospel Hump Wilderness Area was also not reported by IDS. Although records indicated that most of this area was surveyed annually from 2014-2018, it is possible that it was surveyed incompletely, as wilderness areas are of lower priority in IDS surveys.
We found general spatial patterns of tree mortality as detected by surveyors and satellite imagery to be similar, although assessment of IDS spatial error was beyond the scope of our analysis. We assumed that IDS polygons were mapped accurately, although the spatial accuracy of IDS polygons has been shown to be accurate at only coarse scales. Johnson and Ross [29] examined IDS spatial accuracy in 233 plots across Wyoming, Colorado, South Dakota, and Nebraska and found IDS spatial accuracy to be 61%, 68%, and 79%, when spatial tolerances of 0, 50, and 500 m, respectively, were applied. Backsen and Howell [12] found a significant difference between IDS polygons extents and the extents of polygons derived from photo interpretation of aerial imagery; by applying a 91.4 m buffer to polygons, extents became more comparable. The severity estimate accuracies that we found would likely be improved if we had accounted for spatial differences between IDS and WV2. Future studies that control for IDS spatial error could assess IDS severity accuracy independently of IDS spatial error.
The multi-year occurrence of continuous tree mortality across our study area complicated our comparison of IDS and WV2 imagery and might explain some discrepancies between IDS-and WV2-derived estimates of tree mortality severity. We assumed that killed trees either had lost all needles/fine branches or would no longer be standing five years after death [14,[22][23][24][25] and would therefore not be visible on WV2 imagery; hence our comparison of WV2 with the previous five years of IDS data (2014-2018). In reality, some trees killed from 2014-2018 likely fell or deteriorated quickly, losing needles and fine branches, and were not visible on imagery. Conversely, some trees killed before 2014 were likely still standing and might have been visible on WV2 imagery. Future studies that compare satellite observations with IDS data for a more discrete mortality event in which no mortality occurred prior to outbreak initiation would not suffer from this temporal complication and might provide a better assessment of IDS severity accuracy. For example, a severe outbreak in a pine forest, which tend to retain needles longer than other conifers [22,30], could support further evaluation efforts. Also, individual or small clumps of trees within our study area may have been missed by IDS surveys and/or defoliation by the western hemlock looper (Lambdina fiscellaria lugubrosa (Hulst)) population that was building in the area in 2018 may have been detected by WV2 as tree mortality.
We only assessed the IDS severity accuracy and the IDS methods across a relatively small spatial extent in Nez Perce-Clearwater National Forest in US Forest Service Region 1. Due to the subjective nature of the aerial survey method, analyses done in additional study areas (where aerial survey methods might differ) or across a larger spatial extent might result in different conclusions.
Satellite imagery and IDS provide different, complementary estimates of tree mortality severity. Satellite imagery can provide wall-to-wall estimates across large areas, although clouds can obscure mortality, as we found in our analysis. Satellites detect all types of disturbance, which might be disadvantageous, and distinguishing different conifer species and mortality agents is also difficult to accomplish, even with fine-resolution remote sensing. Conversely, the IDS data include host tree and damage agent information, which aerial surveyors can distinguish with high accuracy [13]. However, surveyors have only a limited amount of time to document mortality, defoliation, and discoloration in the tree canopy. Furthermore, not all areas are surveyed every year, and some wilderness areas are only flown over infrequently, whereas moderate satellite imagery has provided complete coverage of most areas since 1984. Although we and others have found IDS data to be fairly accurate [13], mapping accuracy and repeatability is dependent on a variety of human factors and errors.

Conclusions
Our use of satellite-derived maps of tree mortality from bark beetles allowed us to evaluate aerial survey estimates. Due to the recent change in survey systems, we used three different methods to link estimates of mortality severity using the older system (trees per acre) with estimates from the newer system (percent mortality). We found that the FHP and FHAAST methods yielded the highest accuracies. The aerial surveys did not detect substantial low severity levels of mortality in the study region very well, thus producing a significantly lower estimate of the total mortality area than that derived from the satellite imagery. Our study adds to the limited understanding of the accuracy of aerial surveys, which are a key resource for understanding forest disturbances in the western US. Future studies that control for temporal ambiguity between IDS and satellite imagery (by, for instance, analyzing a time series of imagery) as well as IDS spatial error (by allowing for some tolerance in spatial position) might provide additional understanding of IDS mortality severity accuracy.      Shaded cells indicate correctly classified polygons.