Mapping Multiple Insect Outbreaks Across Large Regions Annually Using Landsat Time Series Data

: Forest insect outbreaks have caused and will continue to cause extensive tree mortality worldwide, affecting ecosystem services provided by forests. Remote sensing is an effective tool for detecting and mapping tree mortality caused by forest insect outbreaks. In this study, we map insect-caused tree mortality across three coniferous forests in the Western United States for the years 1984 to 2018. First, we mapped mortality at the tree level using field observations and high-resolution multispectral imagery collected in 2010, 2011, and 2018. Using these high-resolution maps of tree mortality as reference images, we then classified moderate-resolution Landsat imagery as disturbed or undisturbed and for disturbed pixels, predicted percent tree mortality with random forest (RF) models. The classification approach and RF models were then applied to time series of Landsat imagery generated with Google Earth Engine (GEE) to create annual maps of percent tree mortality. We separated disturbed from undisturbed forest with overall accuracies of 74% to 80%. Cross-validated RF models explained 61% to 68% of the variation in percent tree mortality within disturbed 30-m pixels. Landsat-derived maps of tree mortality were comparable to vector aerial survey data for a variety of insect agents, in terms of spatial patterns of mortality and annual estimates of total mortality area. However, low-level tree mortality was not always detected. We conclude that our methodology has the potential to generate reasonable estimates of annual tree mortality across large extents.


Introduction
Forest insect outbreaks have caused widespread tree mortality worldwide in recent years [1]. Although insect-caused tree mortality has been ongoing for millennia [2], recent climate conditions have facilitated insect outbreaks by increasing tree vulnerability and altering insect population dynamics [1,[3][4][5]. Extensive tree mortality caused by insect outbreaks affects crucial ecosystem services provided by forests such as water supply [6,7], carbon storage [8], wildlife habitat [9], and the availability of forest products [10]. Accurate estimates of the extent and severity of insect-caused tree mortality are important for determining to what degree ecosystem services have been affected [11][12][13]. In this study, we demonstrate a methodology for mapping insect-caused tree mortality annually across large extents using Landsat time series analysis.
The United States Department of Agriculture (USDA) Forest Service and partner agencies conduct annual insect and disease surveys (IDS) to monitor insect activity across the nation's forests. Surveyors aboard fixed-wing aircraft estimate the spatial extent, insect mortality agent, tree species host, and severity of insect-caused tree damage. IDS data are not only useful for monitoring and anticipating future insect-caused tree mortality [14,15], but are also valuable for research applications investigating the causes and effects of forest insect outbreaks [8,16,17]. However, IDS data accuracy is limited; surveyors only have a short time to estimate often extensive tree damage; surveying conditions can be less than ideal; surveyor experience can influence accuracy; and some areas such as wilderness are surveyed only infrequently, leading to data gaps [18]. In addition, IDS data coverage might become less extensive in the future than recent years as costs increase and budgets decrease; therefore, the USDA Forest Health Protection program responsible for overseeing IDS is making efforts to incorporate satellite remote sensing in future forest health monitoring [19].
Satellite imagery is becoming a logical alternative to aerial surveys for monitoring and mapping insect-caused tree mortality. Landsat satellites, with moderate spatial resolutions of 30 m and temporal revisit times of 16 days, are particularly well-suited for this task. Furthermore, the Landsat TM mission of satellites has been in operation since 1984, making it possible to perform long-term time series analyses. Several algorithms have been developed for detecting general forest change through time with Landsat time series analysis [20]; these algorithms include Vegetation Change Tracker (VCT) [21], Landsat-based detection of Trends in Disturbance and Recovery (LandTrendr) [22], Breaks for Additive Seasonal and Trend (BFAST) [23], and Continuous Change Detection and Classification (CCDC) [24]. Google Earth Engine (GEE) has recently made the application of such algorithms more accessible to users [25]; for instance, the LandTrendr algorithm and associated Landsat preprocessing tasks have been implemented in GEE [26].
Previous work has shown how Landsat time series analysis can be used to estimate continuous levels (i.e., percent of a 30-m pixel) of severe insect-related forest disturbances (see Senf et al. [27] for a review). Meigs et al. [28] showed that LandTrendr was sensitive to continuous levels of defoliator and bark beetle damage in Oregon, USA. Townsend et al. [29] developed models predicting canopy defoliation severity in broadleaf deciduous forests in the eastern USA. Meddens et al. [30] and Meddens and Hicke [31] demonstrated a methodology for predicting continuous mountain pine beetle damage through time in Colorado, USA; in this methodology, tree mortality is mapped using high-resolution satellite imagery, which is then used to 1) classify forested Landsat pixels as disturbed or undisturbed, and 2) predict the continuous percent tree mortality of disturbed pixels. Assal et al. [32] and Long and Lawrence [33] followed a similar approach to Meddens and Hicke [31] to estimate continuous mountain pine beetle-caused tree mortality in Montana, USA. Meigs et al. [34] predicted and mapped bark beetle and defoliator damage in terms of basal area across Oregon and Washington, USA. Woodward et al. [35] used a time series of Landsat data to map tree mortality severity caused by spruce beetles in Colorado, USA. Most previous studies have demonstrated the use of Landsat imagery for predicting mountain pine beetle-caused tree mortality; more studies are needed that demonstrate the use of Landsat image time series for predicting continuous tree mortality caused by other mortality agents.
To extend the work of Meddens et al. [30] and Meddens and Hicke [31], we investigated how their methodology performed in three additional study areas in the Western United States that have experienced tree mortality caused by a variety of mortality agents. Our approach differed from that of Meddens et al. [30] and Meddens and Hicke [31] in that we 1) used GEE [25] to generate Landsat image time series and implement the algorithm of Meddens and Hicke [31]; and 2) we used Random Forest (RF) modeling rather than linear modeling to predict continuous tree mortality. We conclude that the methodology of Meddens et al. [30] and Meddens and Hicke [31], implemented within GEE, provides an effective means for mapping insect-caused tree mortality annually across large spatial extents.

Study Areas
We chose three areas in the Western United States where insect activity has been elevated in recent years, and where classified high-resolution multispectral imagery was available. Study areas included the 1) Stanley, Idaho vicinity, 2) Chimney Park, Wyoming vicinity, and 3) Elk City, Idaho vicinity ( Figure 1). The forest around both Stanley and Chimney Park is dominated by lodgepole pine (Pinus contorta Douglas ex Loudon), with Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) and ponderosa pine (Pinus ponderosa Lawson and C. Lawson) occurring at lower elevations, and Engelmann spruce (Picea engelmannii Parry ex Engelm.), subalpine fir (Abies lasiocarpa (Hook.) Nutt.), limber pine (Pinus flexilis James), and whitebark pine (Pinus albicaulis Engelm.) occurring at higher elevations. The forest around Elk City is mixed-conifer consisting of grand fir (Abies grandis (Douglas ex D. Don) Lindl.), Douglas-fir, ponderosa pine, lodgepole pine, Western white pine (Pinus monticola (Douglas ex D. Don)), Western larch (Larix occidentalis Nutt.), Engelmann spruce, and subalpine fir.
Mountain pine beetle (Dendroctonus ponderosae Hopkins) outbreaks began in 2000 and 2005 in the Stanley and Chimney Park study areas, respectively. Other prominent mortality agents in the Stanley and Chimney Park study areas include Western spruce budworms (Choristoneura occidentalis Freeman) in Douglas-fir and subalpine fir, Douglas-fir beetles (Dendroctonus pseudotsugae Hopkins) in Douglas-fir, and spruce beetles (Dendroctonus rufipennis Kirby) in Engelmann spruce. In the Elk City study area, insect activity has been elevated for many years and includes several mortality agents: mountain pine beetles in lodgepole pine, Douglas-fir beetles in Douglas-fir, fir engravers (Scolytus ventralis LeConte) in grand fir, spruce beetles in Engelmann spruce, Western pine beetles (Dendroctonus brevicomis LeConte) in ponderosa pine, Western balsam bark beetles (Dryocoetes confuses Swaine) in subalpine fir, and balsam woolly adelgids (Adelges piceae (Ratzeburg)) in grand and subalpine fir.
Terrain is generally mountainous in all study areas with elevations that range from 1080 to 3270, 2000 to 4140, and 370 to 2710 m above sea level for the Stanley, Chimney Park, and Elk City study areas, respectively. Temperature averages 2.0, 2.7, and 3.8 °C, and annual precipitation averages 654, 381, and 800 mm for the Stanley, Chimney Park, and Elk City study areas, respectively (1961 to 1990) [36].

Field Observations
Field observations of live and beetle-killed trees in the Stanley and Elk City study areas were gathered in the summers of 2010 and 2018, respectively. Field observations were not gathered in the Chimney Park study area. As part of a previous project [37], we stratified the Stanley study area by biomass using pre-outbreak NDVI to sample in a variety of forest conditions. The Elk City study area was stratified by mortality severity and agent using 2017 to 2018 IDS data, to ensure sampling across a range of mortality conditions and agents [38]. In both study areas, we located plot centers following stratified random sampling designs, and navigated to plot centers with professional-grade Trimble GNSS receivers in the field.
At plot centers, we established circular 400 m 2 fixed-radius plots (radii of 11.3 m) and measured all trees within plot extents with a diameter at breast height (DBH) >7 cm and >10 cm in Stanley and Elk City, respectively. In addition to DBH, we measured tree condition (live or dead), species, canopy dominance, mortality agent, estimated time since death (based on needle, branch, and bole condition, see Bright et al. [38], Table A1) or how many needles were remaining on dead trees, and distance and bearing from plot center. We logged several hundred positions at plot centers using the GNSS receivers, and differentially corrected plot center locations so that estimated plot center accuracies were <1 m. Plot center locations along with distance and bearing measurements were used to calculate individual tree locations, which were then used to select pixels for image classification.

High-Resolution and Landsat Imagery
High-resolution multispectral imagery was acquired over the Stanley, Chimney Park, and Elk City study areas in 2010, 2011, and 2018, respectively (Figure 1). High-resolution imagery characteristics varied by study area ( Table 1). Imagery of the Stanley study area consisted of digital aerial orthophotography, whereas Chimney Park and Elk City imagery came from the QuickBird and WorldView-2 satellites. Landsat imagery was processed in and acquired from the LandTrendr implementation in GEE, which uses USGS Landsat Surface Reflectance Tier 1 imagery data sets.

Imagery Processing
In each of the three study areas, high-resolution imagery was used to create maps of tree mortality, which were then aggregated to a spatial resolution of 30-m, and used as reference data to 1) classify Landsat imagery as non-forest, disturbed forest and undisturbed forest, and 2) predict percent tree mortality of disturbed pixels ( Figure 2). Models predicting percent tree mortality were then applied to Landsat time series spanning the years 1984 to 2018. Each of these steps are described in more detail below. Overview of how field observations, high-resolution imagery, and Landsat imagery were used to create maps of percent tree mortality. This process was repeated for each of the three study areas.

High-Resolution Multispectral Imagery Processing
Image classification routines for each study area resulted in the separation of beetle-killed trees from live trees. The routines all used maximum likelihood classification and similar classification variables, including the Normalized Difference Vegetation Index (NDVI) and Red-Green Index (RGI), which are defined as: where NIR is the near-infrared band, RED is the red band, and GREEN is the green band. Classification accuracy was evaluated using confusion matrices and visual comparison with truecolor imagery. For details on the classification of the Stanley aerial orthoimagery, see Bright et al. [37]. Briefly here, digital aerial orthoimagery was aggregated by averaging 0.2-m pixels to a spatial resolution of 2.4 m, the approximate average crown area of canopy trees in the imagery extent; aggregating imagery to 2.4 m pixels increased classification accuracy by reducing confusion between sunlit and shadowed tree crowns [39]. Sixteen-bit digital numbers (DN) were divided by 65,535 so that pixel values ranged from 0 to 1. Field observations were used to identify live and beetle-killed tree pixels on aggregated imagery. Two-thirds of the identified pixels were used for classification training; the remaining third were used for classification evaluation. Classification consisted of several steps: 1) water was classified using the NIR band (NIR <0.15 indicated water); 2) shadows were separated from non-shadow pixels using the RED band (RED <0.15 indicated shadow); 3) a canopy height model (CHM) derived from coincident light detection and ranging data (lidar) was used separate forest and non-forest (CHM <0.5 m indicated non-forest); 4) pixels with a RGI >1.08 were classified as newly-killed red trees, and the remaining pixels were classified as either green or gray trees using NIR, NDVI, and RGI in a maximum likelihood classifier. Water, shadow, and red tree thresholds were determined from visual inspection of true-and false-color imagery [37,39]. Overall lifeform classification accuracy was 87%.
The Chimney Park QuickBird satellite imagery was classified similarly to the Stanley imagery, except no field observations were available for calibration/evaluation. DNs in 11 bits were divided by 2048 to convert pixels to values between 0 and 1. The topography of the Chimney Park study area is relatively flat; healthy trees and red and gray beetle-killed tree pixels were easily identified on trueand false-color QuickBird imagery ( Figure 1). Pixels of green and brown herbaceous cover and nonvegetated surfaces such as bare soil and road were also identified on imagery. Six hundred pixels of each class were identified across the image; half were used for classification training and half for evaluation. Water and shadow pixels were classified with threshold values of NIR <0.07 and RED <0.04, respectively, and the remainder of classes were classified in a maximum likelihood classifier using GREEN, NDVI, and RGI as classification variables. Overall lifeform classification accuracy was 97%.
Elk City WorldView-2 imagery was delivered as a standard (2A) product, which had been orthocorrected using a coarse digital elevation model (DEM). Imagery appeared to be well-corrected for terrain when compared with high-resolution orthoimagery. Imagery was converted from DNs to top-of-atmosphere (TOA) reflectance using equations recommended by DigitalGlobe [40]. TOA reflectance imagery was then converted to surface reflectance using empirical line calibration [41,42]. We selected live and dead tree pixels on imagery using field-measured tree locations. Visual inspection of imagery was used to select pixels of green and brown herbaceous cover, as well as nonvegetated surfaces. Half of the pixels were used for classification training; the remainder were used for classification evaluation. Imagery was classified using the GREEN, red-edge (WV2 band 6), NIR (WV2 band 8), NDVI, and RGI as classification variables in a maximum likelihood classifier. Shadows were masked with the RED band (RED <4.1% indicated shadow). Overall lifeform classification accuracy was 96%.
High-resolution classifications for each study area were then aggregated to 30-m grids to be used for comparison with Landsat imagery. We created a mode aggregation grid for each study area, where aggregated 30-m pixels took the value of the modal class, as well as grids of percent area by class, including percent dead tree grids to be used as response variables in predictive models.

Landsat Imagery Processing
Annual Landsat image time series for the years 1984 to 2018 coincident with high-resolution imagery were created using functions from the LandTrendr implementation in GEE [25,26]. Newer Landsat-8 OLI imagery (launched in 2013) differs spectrally from older Landsat-5 TM and Landsat-7 ETM+ imagery. To account for this difference when building imagery time series that include both older and newer Landsat-8 OLI imagery, the LandTrendr implementation in GEE transforms Landsat-8 OLI bands to Landsat-7 ETM+ band equivalents using the equations of Roy et al. [43] (Table  2). Using the 'buildSRcollection' function, we created annual mediod (value closest to the median) composite images of surface reflectance for the years 1984 to 2018 from cloud-, shadow-, and snowfree imagery between the dates of June 20 and September 20. Annual time series of individual bands and indices were then created using the 'transformSRcollection' and 'collectionToBandStack' functions ( Table 2). Note that we did not run the LandTrendr algorithm on our time series data to identify disturbed pixels, but rather used the algorithm of Meddens and Hicke [31], which we describe below. We chose to use the preprocessing routines available in the LandTrendr implementation of GEE because we found them to be well-documented and straightforward. The iterative method of Meddens et al. [30] was used to create time series of B5/B4 and NBR anomalies from undisturbed means (B54A, NBRA). To create undisturbed mean B5/B4 and NBR images, mean images of B5/B4 and NBR for all years (1984-2018) were created; pixels that were more than one standard deviation from means in the direction of disturbances (positive for B5/B4 and negative for NBR) were identified as disturbed pixels, and mean images without disturbed pixels were recalculated. This process was repeated 10 times until means stabilized to create undisturbed mean B5/B4 and NBR images. One assumption of this approach is that forested pixels were undisturbed some time during the years 1984 to 2018.
Time series of B54A and NBRA for the years 1984 to 2018 were then created by subtracting undisturbed mean B5/B4 and NBR images from annual B5/B4 and NBR images, respectively. B54A and NBRA were used for classification of predominantly live (undisturbed) or predominantly dead (disturbed) tree cover; B54A was also used as a candidate predictor variable in models predicting percent mortality. Landsat imagery preprocessing was performed in GEE, and output from GEE for classification and predictive modeling analysis in R [44].

Landsat Classification
Following the classification method of Meddens et al. [30], we created a classification of nonforest, disturbed forest, and undisturbed forest for each of the three study areas. Landsat classifications were created for the years that corresponded to high-resolution imagery years; 2010, 2011, and 2018 for the Stanley, Chimney Park, and Elk City study areas, respectively. Forest was separated from non-forest using a Band 2 (B2) threshold, where the darkest pixels corresponded to forest [45]. Further separation of forest from green herbaceous cover was possible by using a Tasseled-Cap Greenness (TCG) threshold, where greater TCG values represented green herbaceous cover. B2 and TCG thresholds that maximized classification accuracy were chosen to separate nonforest and forest pixels. Forested pixels were then classified as undisturbed or disturbed using B54A or NBRA thresholds that maximized classification accuracy of undisturbed and disturbed forest classes. Classification accuracy was assessed using high-resolution imagery as reference data and confusion matrices [46].

Prediction of Percent Mortality with Random Forest Modeling
Following an approach similar to that of Meddens and Hicke [31], we developed models predicting percent tree mortality, as determined from aggregated high-resolution classifications, from Landsat variables ( Table 2). We chose to use nonparametric Random Forest (RF) modeling [47,48] to avoid possible violations of normal linear regression [49]. In order to give various mortality severities equal weight in RF models, we took equal-sized pixel samples from 10% mortality classes of 0%-10%, 10%-20%, 20%-30%, etc., and combined samples for modeling. We eliminated one variable from highly correlated (r > 0.9) variable pairs; the variable that was least correlated with percent mortality, the response variable, was eliminated from correlated pairs. The 'rf.modelSel' function of the 'rfUtilities' package in R was then used to select the best model [50]. The 'rf.modelSel' function creates scaled variable importance scores for each variable, and then tests models that include variables above various importance thresholds; the model that explains the most variation in the response variable with the fewest predictor variables is selected as best. We developed four RF models total; one model for each study area and a general model combining an equal number of samples from each study area (500 observations from each study area for a total of 1500 observations). Ten-fold cross-validation implemented in the 'caret' package of R [51] was used to assess RF model performance in predicting new observations.
Final RF models predicting percent tree mortality for each study area were applied to Landsat time series (1984-2018) predictor variable grids that were 1x1 degrees in size and centered on study areas ( Figure 1). Models were applied to pixels classified as disturbed in both the current and following year because we assumed, like Meddens and Hicke [31], that Landsat did not detect initial low-level mortality typical of insect outbreak outset. Pixels classified as non-forest or undisturbed were assigned a value of zero. Pixels that burned, as indicated by MTBS fire perimeter polygons [52], as well as pixels where timber harvest was documented [53], were also assigned a value of zero. We created annual grids of new mortality by subtracting mortality grids from the mortality grid of the previous year. Model application was done with the 'raster' package in R [54].

Comparison with Insect and Disease Aerial Survey Data
We compared total annual mortality area across 1x1 degree extents as estimated by our RF models with annual estimates of total mortality area from USDA insect and disease survey (IDS) polygon data [55], and visually compared Landsat-derived and IDS estimates of tree mortality severity. IDS aerial surveyors estimated mortality severity in terms of trees per acre (TPA) prior to 2016, although some USFS Regions adopted the percent mortality metric as early as 2012. To convert legacy TPA estimates to percent mortality so that total mortality area within IDS polygons could be computed, we used two methodologies: Egan et al. [56], which provides conservative estimates of mortality severity [38], and the methodology of Meddens et al. [57], which provides more liberal mortality severity estimates. The methodology of Egan et al. [56], which utilizes the relationship between surveyed field TPA and percent mortality plot data to adjust IDS TPA estimates, assigns percent mortality values of 5%, 20%, and 65% to IDS polygons having TPA estimates from 0 to 9.9, 10 to 30, and > 30, respectively. The methodology of Meddens et al. [57], which is based on a comparison of high-resolution remote sensing data and IDS TPA, estimates percent mortality with the following: where PM is percent mortality, TPA is the legacy trees per acre (TPA) attribute of the polygon and CA is the species-specific average crown area, in acres, of the recorded tree host [57] (Appendix A); the 13.6 multiplier compensates for IDS underestimation bias for most tree species/damage agent combinations [57,58]. IDS polygons of abiotic (0.6% of polygons) or insect-caused tree mortality without TPA measurements were assigned a TPA value of 2, the median for all IDS polygons. Our aim was to predict and map tree mortality. However, initial comparison of our tree mortality maps with IDS polygons showed that our approach also detected biotic defoliation or mortality caused by defoliation, hence we included IDS polygons representing defoliation damage when validating our Landsat-derived maps with independent IDS polygon data. IDS polygons of defoliator damage were assigned a percent mortality value of 10% [59][60][61], the median for all IDS polygons after conversion from TPA.

Landsat Classification into Non-Forest, Undisturbed Forest, and Disturbed Forest
Overall classification accuracies ranged from 74% to 80% for pixels that contained > 50% of a given class (Table 3, Tables A1-A3). We found that using the B2 mean or the B2 mean plus one standard deviation worked well for separating forest from non-forest. TCG thresholds of the mean TCG plus one or two standard deviations further separated forest from green herbaceous cover. B54A consistently separated disturbed from non-disturbed forest with greater accuracy than NBRA, so we used B54A for final classifications. B54A thresholds that maximized class accuracies ranged from 0.165 to 0.307 (Table 3, Figure 3).

Prediction of Percent Tree Mortality with Random Forest Modeling
Ten-fold cross-validated RF models explained 61% to 68% of the variation in percent mortality (Table 4, Figure 4). TCB, NDVI, and NBR were the most important variables in RF models for the Stanley, Chimney Park, and Elk City study areas, respectively. TCG and B54A were selected as predictor variables in all study areas. B5, NBR, and TCW were selected as predictor variables in two study areas. B54A and NBR were the most important variables in the general model. Spatial patterns of tree mortality severity as predicted by Landsat variables matched those of high-resolution imagery ( Figure 5).  . Percent tree mortality as observed by high-resolution multispectral imagery versus percent tree mortality as predicted by Landsat variables via cross-validated random forest models. Highresolution multispectral imagery for Stanley, Chimney Park, and Elk City was digital aerial orthophotography, QuickBird, and WorldView-2, respectively.

Figure 5.
Comparison of true-color high-resolution imagery, high-resolution classifications, and percent mortality as estimated by Landsat variables in random forest (RF) models.

Comparison of Landsat-Derived Percent Tree Mortalty Maps with Aerial Survey Data
Annual trends of total mortality area as predicted by Landsat were generally similar to those of IDS, with some discrepancies (Figure 6). In Stanley, both Landsat and IDS detected elevated levels of tree mortality from 1985 to 1994, a period of low tree mortality from 1995 to 2001, and elevated levels of mortality from 2001 to 2017. A large defoliation event was documented by IDS in 2015; Landsat also detected tree damage in the same area, but with less total area ( Figure 7B).
In Chimney Park, both Landsat and IDS detected little or no tree mortality until 2005 and a large mortality event from 2005 to 2012 ( Figure 6). In 2011, at the peak of tree mortality, IDS estimated greater total mortality area (52 to 112 kha), than Landsat (24 kha).
Annual trends of total mortality area were most dissimilar between Landsat and IDS in Elk City ( Figure 6); Landsat detected continuous mortality levels of 2-10 kha yr -1 throughout the entire time series (1984 to 2018), whereas IDS did not show elevated tree mortality levels >1 kha yr -1 until 1997. IDS showed a peak in mortality of 7-14 kha yr -1 in 2001 and 2002; Landsat detected mortality in the same vicinity, but at lower levels of 4-5 kha yr -1 . Although annual trends differed, overall mortality levels ranged from 0.1-14 kha yr -1 for both Landsat and IDS throughout the time series, and Landsat and IDS trends post-2002 tracked each other ( Figure 6).
Visual comparison of Landsat-derived maps of tree mortality with IDS polygons generally showed good agreement for a variety of insect agents, especially for moderate and severe forest mortality (Figure 7). Agreement was less apparent for low forest mortality; Landsat often detected forest disturbance in and around the vicinity of low severity IDS polygons, but not always ( Figure  7B,E,F). Although tree mortality caused by fires and harvest, as well as non-forest, were masked, Landsat maps also showed mortality outside IDS polygons ( Figure 7E,F). Both the Stanley and Elk City 1x1 degree study areas contained large portions of wilderness areas in which Landsat maps revealed substantial unsurveyed tree mortality.

Discussion
We estimated insect-caused tree mortality caused by a variety of mortality agents across three large areas for the years 1984 to 2018 using Landsat time series analysis. We separated non-forest, undisturbed, and disturbed forest with overall accuracies of 74% to 80% and explained 61% to 68% of the variation in percent tree mortality using Landsat variables. Similar to others, we found the Landsat shortwave-infrared to near-infrared ratio (B54) index to be a good indicator of change in coniferous forest [31,62,63]. The B54 index was likely sensitive to changes in forest structure caused by forest mortality, rather than directly sensing changes to vegetation physiology. For our application, the detection of insect-caused canopy disturbance, B54 outperformed NBR, another popular index for documenting forest change. However, NBR, as well as NDVI and Landsat Tasseled-Cap variables, were still found to be important predictors of the degree of insect-caused forest decline. Rates of forest disturbance that we estimated within our study areas (0.1% to 3% of forest area yr -1 ) were similar to the national estimates of Cohen et al. [64] (1.5% to 4.5% of forest area yr -1 ).
We found GEE [25] and the preprocessing functions of the LandTrendr implementation in GEE [26] to be powerful tools for simplifying the processing of Landsat time series data. Before GEE, the task of preprocessing and managing Landsat time series data required significant resources and time. What required days or weeks of work can now be completed in hours with the functions available in the LandTrendr implementation of GEE, which we found to be simple to understand and execute. The simplicity and efficiency of GEE makes the application of our method across large areas feasible.
RF modeling provided several benefits over linear modeling. First, RF allowed for nonlinear relationships between response and predictor variables. Although many of the relationships between our response variable, percent tree mortality, and Landsat predictor variables were approximately linear, some, particularly B54A, were nonlinear. Second, RF modeling is better suited to modeling continuous percent tree mortality, which can be zero-inflated, i.e., having many values near zero. Zero-inflated data can result in heteroskedastic variance of errors, a violation of normal linear regression [49]. Third, the RF algorithm partitions data into training and evaluation datasets internally. Although we performed ten-fold cross-validation to rigorously evaluate our RF models, we found that model diagnostics of cross-validated models were nearly identical to model diagnostics of non-cross-validated models, suggesting that cross-validation was not necessary and the out-of-bag estimate of error provided by RF was sufficient.
Our methodology has the potential to be applied across large spatial extents. The B2 and TCG thresholds that we used to separate forest from non-forest, as well as the B54A threshold values that we used to separate undisturbed from disturbed forest were comparable across our three study areas (Table 3). In addition, our general RF model combining observations from all study areas performed similarly to RF models of individual study areas (Table 4, Figure 4). This suggests that we could apply this methodology across a large region, such as an entire wilderness area, national forest, state, or Forest Service Region. However, caution should be used when applying algorithms and models outside areas where those models were calibrated, as accuracy might decrease in new areas [63,65]. The use of several calibration sites distributed across the larger area of interest, as well as applying algorithms and models only within similar forest types, would help to minimize inappropriate extrapolation outside calibration areas. Few previous studies have applied Landsat time series analysis for the detection of insect-caused tree mortality across large study areas [27,31,34].
Detection of lower-severity forest disturbance with Landsat remains a challenge. Insect-caused tree mortality usually occurs gradually and at lower levels relative to fire or harvest, making it more challenging to detect with moderate-resolution satellites. Numerous previous studies have shown that Landsat time series analysis can be used to detect high-severity forest disturbance, where most trees within a 30-m pixel are disturbed, with high accuracy [30,32,49,66]. However, lower-severity disturbance, where fewer trees are disturbed within a 30-m pixel, is detected with less accuracy [30,63,66]. Although we did not specifically examine classification accuracy by disturbance severity, comparison of our Landsat-derived maps with IDS polygons showed that low-severity disturbance as detected by IDS surveyors was not always detected with our Landsat time series analysis, as expected.
Comparison of our Landsat-derived mortality maps with IDS showed some discrepancies, especially in areas of low tree mortality (Figures 6 and 7). However, IDS data are a fundamentally different, vector-based representation of disturbance, so differences are expected. For example, IDS showed a large amount of defoliator-caused tree damage in our Stanley study area in 2015, while Landsat showed much less total area affected ( Figure 6). Examination of IDS polygons within a geographic information system (GIS) showed that IDS polygons were large and highly generalized ( Figure 7B). Our Landsat-derived maps detected damage in the same areas, but perhaps more precisely. Meigs et al. [34] made a similar observation that IDS polygons are a generalization of damage area that can exaggerate overall area affected, whereas Landsat methods can give more resolved damage estimates at 30-m resolution. So, while our estimates of low-level damage did not always agree with IDS polygons, Landsat representations of low-level tree damage, when detected, are likely more accurate representations of diffuse tree damage than large generalized IDS polygons.
Another notable difference between our Landsat-derived maps and IDS was in the Elk City study area for the years before 1997, where Landsat detected considerable disturbance and IDS showed little. This discrepancy could be caused by limited IDS survey coverage in the 1980s and 1990s, when IDS flights often targeted specific insect agents and some areas were missed on a regular basis. We could not verify this because no information on survey extent was available for IDS data prior to 1997. Our Landsat methodology could have also detected non-insect-related forest disturbance, such as undocumented forest harvest, small fires, or forest decline such as that caused by drought [64]. Indeed, Landsat identified considerable mortality in 1987, the driest year of the years 1980 to 2018 in the Elk City vicinity. Although we masked areas of documented forest harvest, many of the areas identified as being disturbed in the Elk City area prior to 1997 occurred adjacent to areas of timber harvest, particularly along the edges of harvested areas. This highlights one limitation of our approach, that beyond masking large burns and documented harvest with ancillary data, disturbance is not separated by type.
Attributing causes of forest change with satellite time series analysis is a topic of current research [27], where others have made some progress. Senf et al. [67] successfully distinguished between mountain pine beetle mortality and Western spruce budworm damage in British Columbia, Canada using LandTrendr-derived spectral trajectories. Meigs et al. [34] used IDS polygons as ancillary information for distinguishing between forest damage caused by mountain pine beetles and Western spruce budworms. Cohen et al. [64] used TimeSync [68] and ancillary datasets to distinguish harvest and fire from forest decline across the United States. Oeser et al. [69] distinguished between windthrow, bark beetles, and harvest in central Europe with intra-annual Landsat time series analysis. Vogeler et al. [70] distinguished various types of forest disturbance in Minnesota, USA using Landsat time series analysis. Future work could implement similar approaches that use spectral trajectory differences or incorporate ancillary data to further distinguish tree mortality by type.

Conclusions
We estimated continuous insect-caused tree mortality in three study areas in the Western United States that have experienced recent forest insect outbreaks, and generated maps of continuous tree mortality estimates across large extents for the years 1984 to 2018 for a variety of insect mortality agents. We adapted the previously developed methods of Meddens et al. by using Google Earth Engine (GEE) to pre-process Landsat imagery, and developing nonparametric Random Forest models to predict percent tree mortality. GEE simplified the preprocessing of Landsat time series imagery, making the application of these methods across large spatial extents and the Landsat satellite time record feasible and useful.  The use and/or dissemination of this data and/or of any product in any way derived there from are restricted. Unauthorized use and/or dissemination is prohibited. The QuickBird and WorldView-2 imagery used in this study was obtained through the NextView license agreement.

Conflicts of Interest:
The authors declare no conflict of interest.