1 An evaluation of forest health Insect and Disease 2 Survey data and satellite-based remote sensing forest 3 change detection methods : Case studies in the United 4 States 5

The Operational Remote Sensing (ORS) program leverages Landsat and MODIS data to 15 detect forest disturbances across the conterminous United States (CONUS). The ORS program was 16 initiated in 2014 as a collaboration between the US Department of Agriculture Forest Service 17 Geospatial Technology and Applications Center (GTAC) and the Forest Health Assessment and 18 Applied Sciences Team (FHAAST). The goal of the ORS program is to supplement the Insect and 19 Disease Survey (IDS) and MODIS Real-Time Forest Disturbance (RTFD) programs with imagery20 derived forest disturbance data that can be used to augment traditional IDS data. We developed 21 three algorithms and produced ORS forest change products using both Landsat and MODIS data. 22 These were assessed over Southern New England and the Rio Grande National Forest. Reference 23 data were acquired using TimeSync to conduct an independent accuracy assessment of IDS, RTFD, 24 and ORS products. Overall accuracy for all products ranged from 77.64% to 93.51% (kappa 0.0925 0.59) in the Southern New England study area and 59.57% to 79.57% (kappa 0.09-0.45) in the Rio 26 Grande National Forest study area. In general, ORS products met or exceeded the overall accuracy 27 and kappa of IDS and RTFD products. This demonstrates the current implementation of ORS is 28 sufficient to provide data to augment IDS data. 29


Introduction
The US Forest Service's Forest Health Protection (FHP) program is tasked with protecting and improving the health of America's rural, wildland, and urban forests.The Operational Remote Sensing (ORS) program was developed as a part of the Forest Health Assessment and Applied Sciences Team's (FHAAST) Insect and Disease Survey (IDS) analysis and decision support tools.These tools assist FHP staff, state forestry agencies, and other forest managers to locate, monitor, and map forest health issues.FHAAST supports a number of technology development programs, such as ORS, to assist their partners, increase accuracy, and lower the cost of forest health protection.

Aerial Detection Survey
The ORS program was developed to address the data requirements of the IDS program.IDS is a collection of geospatial data depicting the extent of insect, disease, and other forest disturbance types.In the past, IDS data collection largely came from first-hand aerial observations and sketch mapping via the Aerial Detection Survey (ADS).Supplemental field surveys also contributed to the IDS database.Aerial sketch mapping is now conducted using either Digital Aerial Sketch Mapping (DASM) or Digital Mobile Sketch Mapping (DMSM) systems.These systems consist of tablet computers, GPS units, and specialized software that assist observers in an aircraft to map forest health issues [1].While system improvements such as the more modern DMSM have increased the consistency and quality of the IDS data, they have not significantly reduced the number of hours spent flying.Time spent flying is an issue because conducting aerial surveys is both risky and costly.In 2010, an ADS pilot and two aerial surveyors were killed in an aviation accident [2].While the need for reporting and analysis of ADS data remains, efforts are being made to integrate ground collection and satellite remote sensing to improve safety and quality of IDS data.

Remote Sensing Augmentation to IDS
The Real-Time Forest Disturbance (RTFD) program was designed to detect and track rapid onset forest change events (e.g., rapid defoliation and storm driven changes) as well as slow onset forest change events (e.g., insect induced mortality) across the CONUS on a weekly basis during the growing season [3].The RTFD program serves as a near-real time alarm system to highlight forest disturbances in support of ADS mission planning by state and local forestry personnel.Because of the spatial coarseness of MODIS data and fixed parameters that are used to produce consistent RTFD products, the RTFD program does not necessarily yield spatially refined results that are sufficient to produce spatially explicit maps for the IDS database.In contrast, the goal of the ORS program is to augment the annually produced IDS data with a novel source of forest disturbance map products.To that end, satellite image data of higher spatial resolution relative to the MODIS-based RTFD program can be used, and change detection parameters can be fine-tuned for specific forest disturbance events.
ORS products do not contain disturbance agent information found in IDS data.The spatial resolution of the imagery used to generate data for ORS cannot compare to viewing forest disturbance phenomena from an airplane.Furthermore, no automated algorithm is commensurate to the experience accumulated over a career by an aerial detection surveyor observing the effects of pests and pathogens on forests.On the other hand, analysis performed using remote sensing imagery provides a consistent view of the landscape that is unobtainable by manually sketched maps by multiple interpreters.There are advantages and disadvantages associated with both sources of forest health spatial data.Therefore, the goal of the ORS program is to create forest change maps that provide a balance of spatial completeness, timeliness, and safety.
ORS data are produced as needed in response to a field request or an RTFD detection.ORS data may also be produced at the end of a growing season to supplement the annual IDS data stream.Thematic information contained in ORS maps includes non-tree, tree no change and tree change.

Change Detection Methods
The ORS program targets intra-seasonal change associated with insect defoliation and disease, as well as long-term decline in forest vigor associated with insects and pathogens.Prior to this work, many studies developed methods to meet different land cover change mapping/monitoring needs [4][5][6][7][8][9].Early remote sensing change detection methods generally utilized two-date image pairs in various techniques to detect change [5].These methods proved cost effective and computationally practical at the time.
After the Landsat archive became freely available to the general public in 2008, change detection methods that required annual or biennial time series image stacks of Landsat data became cost effective.Methods such as the Vegetation Change Tracker (VCT) [6] and Landsat-based detection of Trends in Disturbance and Recovery (LandTrendr) [7] were designed to detect and track changes that persist for more than a single growing season.Because VCT is designed to detect abrupt stand-clearing forest change events, it struggles to detect long-term and partial tree cover change events [9].Neither of these methods is sensitive to intra-annual change.
As computational power increased, methods that utilize all available data throughout the year were developed.Breaks for Additive Seasonal and Trend (BFAST) [8] and Continuous Change Detection and Classification (CCDC) [10] use harmonic regression to account for seasonal variation, leaving the remaining variance to perform change analysis.The ability of these algorithms to detect change is largely a function of the frequent availability of cloud/cloud shadow-free data.The frequency must be sufficient to capture the typical seasonality of an area.The one-day revisit frequency of MODIS is likely to provide the necessary temporal density.The 16-day revisit frequency of Landsat is likely insufficient in areas frequently obscured by clouds.
While VCT, LandTrendr, BFAST, and CCDC excel at finding land cover changes across large areas that persist for more than a single observation, finding subtle changes that often only persist for a portion of a single growing season generally requires a more targeted approach.These methods are often referred to as "near-real time" [3,11].Near-real time change detection methods generally compare analysis period observations to a stable baseline period.Because each observation in the analysis period is being compared to many observations in the baseline period, the algorithm can be more sensitive to the onset of subtle change.Some methods that utilize a baseline period and analysis period include the Z-score approach used in the USFS RTFD program [3], BFAST-Monitor [11], and Exponentially Weighted Moving Average Change Detection (EWMACD) [12].The harmonic regression approach used in the latter two of these methods underpin the harmonic Z-score approach applied in this research.
While utilizing EWMACD and/or BFAST-Monitor to meet the needs of the ORS program would be appropriate, ORS required algorithms that were operative within Google Earth Engine (GEE) by spring 2015.At that time, neither of these algorithms was available within GEE, so we built on the ideas from the RTFD Program, EWMACD, and BFAST-Monitor that were practical to implement with GEE.
To compare ORS outputs and other existing FHP spatial products, we created ORS outputs in two geographically disparate study areas from three targeted change detection algorithms that were developed for the ORS project.These include the basic Z-score, harmonic Z-score, and linear trend analysis change detection methods.We then used reference data acquired using TimeSync [13], a time series visualization and data collection tool, to conduct an accuracy assessment to better understand how accurate ORS outputs are compared to RTFD and IDS data products [14,15].

Study Areas
We chose two study areas to evaluate the ability of the ORS program's methods to map and monitor intra-seasonal and multi-year forest decline across different ecological settings (Table 1 and Figure 1).The Rio Grande National Forest study area captures a slow onset forest disturbance caused by mountain pine beetle (Dendroctonus ponderosae) and spruce bark beetle (Dendroctonus rufipennis).The Forest has been experiencing long-term gradual decline and mortality, with a total of 617,000 acres being infested by spruce beetles since 1996 [16].Inclusion of this study area presents an opportunity to test the ability of the change algorithms to discern new areas of forest disturbance from areas affected in previous years.
The Rhode Island, Massachusetts, and Connecticut (Southern New England) study area experienced extensive Gypsy Moth (Lymantria dispar dispar) defoliation events in 2016 and 2017.This region typifies many challenges associated with detecting ephemeral forest disturbances using remote sensing data.These challenges include the temporal variations in phenological conditions and persistent nature of cloud cover in the region.
Within each study area, we analyzed pixels that had a NLCD 2011 Tree Canopy Cover [17] value greater than 30 percent.Within each study area, we analyzed pixels that had a NLCD 2011 Tree Canopy Cover [17] value greater than 30 percent.
Figure 1.Map of study areas included in this project.

IDS-Data
We acquired IDS polygon data [1] for each study area.Surveyors collected these data using established aerial detection survey sketch mapping techniques by the USDA Forest Service.IDS data are used to estimate the spatial extents of forested areas disturbed by insects and diseases, as well as attribute the causal agents of these disturbances [1,18].They are conducted using light aircraft, as well as ground surveys.
IDS data originated from separate aerial and ground surveys performed in Massachusetts, Connecticut, and Rhode Island during 7 June-25 August 2016 and 26 June-26 August 2017 for the Southern New England study area.Rio Grande National Forest study area IDS data were created from a wall-to-wall aerial survey of forested areas in Colorado conducted during 30 July-18 September 2013 and 24 June-3 September 2014.

RTFD-Data
RTFD persistence change data were incorporated into this study to compare with spatial data created using ORS methods.RTFD data are created for the entire CONUS every 8 days from 18 February to 24 November using 16-day MODIS composites.The RTFD approach employs a combination of two separate remote sensing change detection approaches to detect and track quick ephemeral changes in forest health as well as gradually occurring disturbances [3].The first uses a statistical (z-score) change detection method for detecting quick intra-seasonal changes in forest conditions.The second approach uses linear regression to identify areas where slower, multiyear

IDS-Data
We acquired IDS polygon data [1] for each study area.Surveyors collected these data using established aerial detection survey sketch mapping techniques by the USDA Forest Service.IDS data are used to estimate the spatial extents of forested areas disturbed by insects and diseases, as well as attribute the causal agents of these disturbances [1,18].They are conducted using light aircraft, as well as ground surveys.
IDS data originated from separate aerial and ground surveys performed in Massachusetts, Connecticut, and Rhode Island during 7 June-25 August 2016 and 26 June-26 August 2017 for the Southern New England study area.Rio Grande National Forest study area IDS data were created from a wall-to-wall aerial survey of forested areas in Colorado conducted during 30 July-18 September 2013 and 24 June-3 September 2014.

RTFD-Data
RTFD persistence change data were incorporated into this study to compare with spatial data created using ORS methods.RTFD data are created for the entire CONUS every 8 days from 18 February to 24 November using 16-day MODIS composites.The RTFD approach employs a combination of two separate remote sensing change detection approaches to detect and track quick ephemeral changes in forest health as well as gradually occurring disturbances [3].The first uses a statistical (z-score) change detection method for detecting quick intra-seasonal changes in forest conditions.The second approach uses linear regression to identify areas where slower, multiyear changes are occurring in forested areas.Disturbances were characterized using the RTFD z-score change detection method in Southern New England while the linear trend method was employed over the Rio Grande National Forest.In all study areas, the RTFD persistence output was used to spatially depict forest disturbances.The persistence RTFD output combines the results of the last three 16-day compositing period RTFD change outputs.This added processing step minimizes noise in individual compositing period outputs arising from persistent cloud cover and other MODIS ephemera.The persistence output collapses continuous RTFD z-score and linear trend outputs into three classes of disturbance magnitude by combining the disturbance frequency over the last three compositing periods with disturbance intensity based on departure from normal forest conditions [3].The three disturbance magnitude classes were collapsed into binary disturbance maps for use in this study.
RTFD data used in this study were produced during 10 June-27 July 2016 and 26 June-12 August 2017 for the Southern New England study area and during 28 July-28 August for both analysis years for the Rio Grande National Forest study area.

ORS-Data
ORS methods were built using satellite image data that had been collected for at least 5 years and would likely be regularly available into the future.This includes imagery from Landsat Thematic Mapper (TM, 1984(TM, -2011)), Enhanced Thematic Mapper (ETM+, 1999-Present), and Operational Land Imager (OLI, 2013-Present).Each Landsat sensor has a revisit frequency of 16 days.Since Landsat 5 and 7 overlapped during 1999-2011 and Landsat 7 and 8 overlapped during 2013-present, these periods have a revisit frequency of about 8 days.All Landsat-based analyses were conducted at 30 m spatial resolution.
ORS methods were also applied to MODerate-resolution Imaging Spectroradiometer (MODIS) imagery collected from the Terra (1999-present) and Aqua (2002-present) satellites, which is available multiple times a day for most locations on Earth.MODIS imagery has a spectral resolution similar to Landsat, but coarser spatial resolution than Landsat.All MODIS-based analyses were conducted at 250 m spatial resolution.
This project utilized Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) surface reflectance values from Landsat 5 and 7 [19], Landsat 8 Surface Reflectance Code (LaSRC) values for Landsat 8 [20], and MODIS Terra and Aqua 8-day surface reflectance composites.All data were accessed through Google Earth Engine, which acquires MODIS data from the NASA Land Processes Distributed Active Archive Center (LP DAAC), USGS/Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota, and Landsat data from EROS Center, Sioux Falls, South Dakota.LP DAAC MODIS collections used are found in Table 2.

Data Preparation
All data preparation and analysis were performed with Google Earth Engine (GEE).GEE provides access to most freely available earth observation data and provides an application programming interface (API) to analyze and visualize these data [21].
All Landsat images first underwent pixel-wise cloud and cloud shadow masking using the Google cloudScore algorithm for cloud masking and Temporal Dark Outlier Mask (TDOM) for cloud shadows.Since neither of these methods is currently documented in the peer-reviewed literature, they are briefly described below, and are fully described and evaluated in a forthcoming paper [22].MODIS 8-day composites are intended to be cloud-free, so no pre-processing was conducted.
The Google cloudScore algorithm exploits the spectral and thermal properties of clouds to identify and remove these artifacts from image data.To simplify the process, the cloudScore function uses a min/max normalization function to rescale expected reflectance or temperature values between 0 and 1 (Equation ( 1)).The algorithm applies the normalization function to five cloud spectral properties.The final cloudScore is the minimum of these normalized values (Equation ( 2)).The algorithm finds pixels that are bright and cold, but do not share the spectral properties of snow.Specifically, it defines the cloud score as: cloudScore landsat = min{1.0,normalize(blue, 0.1, 0.3), normalize(blue + green + red, 0.2, 0.8), normalize(nir Any pixel with a cloudScore value greater than 0.2 was identified as being a cloud.This value was qualitatively evaluated through extensive use throughout the CONUS to provide the best balance of detecting clouds, while not committing cold, bright surfaces.Any values inside the resulting mask were removed from subsequent steps of the analysis. After clouds were masked, cloud shadows were identified and masked.Since the Landsat archive is extensive across time, cloud shadows generally appear as anomalously dark pixels.GEE provides access to the entire archive of Landsat, enabling methods that require querying extensive time series of data.The Temporal Dark Outlier Mask (TDOM) method was used to identify pixels that are dark in relative and absolute terms.Generally, pixels will not always be obscured by a cloud shadow, making this method effective.Specifically: where M b and σ b are the mean and standard deviation, respectively, of a given band b across the time series, M is the multispectral Landsat image, t ZS is the threshold for the shadow z-score (−1 for this study), and t IR is the threshold for the darkness (0.35 for this study).
The TDOM method first computes the mean (M b ) and standard deviation (σ b ) of the near-infrared (NIR) and shortwave-infrared (SWIR1) bands across a collection of images.For each image, the algorithm then computes the z-score of the NIR and SWIR1 bands (z(M b )) (Equation ( 5)) (Figure 2).Each image also has a darkness metric computed as the sum of the NIR and SWIR1 bands (darkness).Cloud shadows are then identified if a pixel has a z-score of less than −1 for both the NIR and SWIR1 bands and a darkness value less than 0.35 (Equation ( 8)).These thresholds were chosen after extensive qualitative evaluation of TDOM outputs from across the CONUS.This study used the normalized difference vegetation index (NDVI) (Equation ( 6)) [23] and the normalized burn ratio (NBR) (Equation ( 7)) [24] as indices of forest greenness to detect forest changes.
where NIR is the near infrared band of a multispectral image, red is the red band of a multispectral image, and SWIR2 is the second shortwave infrared band of a multispectral image.Many additional common indices are available for use when running ORS algorithms, but were omitted from this study.These two indices are used since the bands necessary to create them are available for both MODIS and Landsat, they are very similar to the indices used in the RTFD program, and they have proven effective throughout the remote sensing change detection literature [6,7,11,23,24].

Change Detection Algorithms
Three algorithms were developed and tested in GEE using MODIS and Landsat data.Employing GEE for this study enabled us to efficiently test various change detection approaches using Landsat and MODIS image data archives, without downloading and storing the data locally.
ORS mapping needs fell into two categories: ephemeral change related to defoliation events and long-term change related to tree mortality from insects and disease.The three algorithms used in this study were selected to address the needs posed by these forest change types.We refer to these algorithms as: basic z-score (Figure 3), harmonic z-score (Figure 4), and linear trend (Figure 5).
The basic z-score method builds on ideas from RTFD, serving as a natural starting point for ORS mapping methods.The harmonic z-score method combines ideas from harmonic regression-based methods, such as EWMACD [12] and BFAST Monitor [11], with those from the basic z-score method to leverage data from throughout the growing season.The linear trend method is built on ideas from This study used the normalized difference vegetation index (NDVI) (Equation ( 6)) [23] and the normalized burn ratio (NBR) (Equation ( 7)) [24] as indices of forest greenness to detect forest changes.
where NIR is the near infrared band of a multispectral image, red is the red band of a multispectral image, and SWIR2 is the second shortwave infrared band of a multispectral image.Many additional common indices are available for use when running ORS algorithms, but were omitted from this study.These two indices are used since the bands necessary to create them are available for both MODIS and Landsat, they are very similar to the indices used in the RTFD program, and they have proven effective throughout the remote sensing change detection literature [6,7,11,23,24].

Change Detection Algorithms
Three algorithms were developed and tested in GEE using MODIS and Landsat data.Employing GEE for this study enabled us to efficiently test various change detection approaches using Landsat and MODIS image data archives, without downloading and storing the data locally.
ORS mapping needs fell into two categories: ephemeral change related to defoliation events and long-term change related to tree mortality from insects and disease.The three algorithms used in this study were selected to address the needs posed by these forest change types.We refer to these algorithms as: basic z-score (Figure 3), harmonic z-score (Figure 4), and linear trend (Figure 5).
The basic z-score method builds on ideas from RTFD, serving as a natural starting point for ORS mapping methods.The harmonic z-score method combines ideas from harmonic regression-based methods, such as EWMACD [12] and BFAST Monitor [11], with those from the basic z-score method to leverage data from throughout the growing season.The linear trend method is built on ideas from Image Trends from Regression Analysis (ITRA) [25] and the RTFD trend method [3].It uses linear regression to fit a line across a series of years.
Both the basic z-score and harmonic z-score method work by identifying pixels that differ from a baseline period.The primary difference is what data are used to compute the z-score.Both methods start by acquiring imagery for a baseline period (generally 3-5 years) and analysis years.The basic z-score method uses the mean and standard deviation of the baseline period within a targeted date range for a specified band or index, while the harmonic z-score method follows the EWMACD and BFAST Monitor methods by first fitting a harmonic regression model to all available cloud/cloud shadow-free Landsat or MODIS data.
The harmonic regression model is intended to mitigate the impact of seasonality on the spectral response, leaving remaining variation to be related to change unrelated to phenology.The harmonic is defined as: where b is the band or index of the image, a n b are the coefficients being fit by the model, and t is the sum of the date expressed as a year and the proportion of a year.
Image Trends from Regression Analysis (ITRA) [25] and the RTFD trend method [3].It uses linear regression to fit a line across a series of years.
Both the basic z-score and harmonic z-score method work by identifying pixels that differ from a baseline period.The primary difference is what data are used to compute the z-score.Both methods start by acquiring imagery for a baseline period (generally 3-5 years) and analysis years.The basic zscore method uses the mean and standard deviation of the baseline period within a targeted date range for a specified band or index, while the harmonic z-score method follows the EWMACD and BFAST Monitor methods by first fitting a harmonic regression model to all available cloud/cloud shadow-free Landsat or MODIS data.
The harmonic regression model is intended to mitigate the impact of seasonality on the spectral response, leaving remaining variation to be related to change unrelated to phenology.The harmonic is defined as: where b is the band or index of the image,  are the coefficients being fit by the model, and t is the sum of the date expressed as a year and the proportion of a year.The harmonic regression model is fitted on a pixel-wise basis to the baseline data and then applied to both the baseline and the analysis data.Next, the residual error is computed for both the baseline and analysis data.Up to this point, the harmonic z-score method, EWMACD, and BFAST Monitor methods are largely the same.The primary difference with these three methods is how change is classified.While EWMACD and BFAST Monitor use exponentially weighted moving average (EWMA) charting and moving sum (MOSUM) of the residuals, respectively [11,12], the harmonic z-score method uses the mean and standard deviation of the targeted date period baseline residuals to compute the z-score of the targeted analysis period residuals.The date range and specific years used to define the baseline can be tailored to optimize the discernment of specific disturbances within the analysis years.To capture the seasonality of a pixel, this approach uses data from the entire year.A harmonic regression model is first fit to the baseline data (2011-2015 in this example).That same harmonic model is then applied to both the baseline and analysis (2016 in this example) data.The mean and standard deviation of the residual error from the baseline observations within the specified date range are then computed.These statistics are then applied to the residual error of the analysis data to find departures from expected seasonality.Finally, residual z-score values for the analysis data within the specified date range is summarized and thresholded to classify change.
Since an analysis period may have more than a single observation, a method for summarizing these values is specified for both the basic and harmonic z-score methods to constrain the final zscore value.For this study, the mean of values was used.Change is then identified by thresholding the summary z-score value.For this project, summarized z-score values less than −0.8 were identified as change.This threshold was chosen based on analyst expertise obtained from iterative qualitative comparison of z-score results with post-disturbance image data.
The linear trend method makes use of the median of the cloud and cloud shadow-free observations available within a specified target date period.This is done for a specified number of years prior to the analysis year, referred to as an epoch.For an epoch, an ordinary least square linear regression model is fit on a pixel-wise basis as follows: where b is the band or index of image, a0 is the intercept, a1 is the slope, t is the date, and y is the predicted value.
Change is identified where the slope (a1) is less than a specified threshold.The threshold used for this study was −0.03.This was chosen based on analyst expertise similar to the identification of the threshold used for the z-score methods.This method differs from ITRA [25] since it does not use a t-test to classify change.To capture the seasonality of a pixel, this approach uses data from the entire year.A harmonic regression model is first fit to the baseline data (2011-2015 in this example).That same harmonic model is then applied to both the baseline and analysis (2016 in this example) data.The mean and standard deviation of the residual error from the baseline observations within the specified date range are then computed.These statistics are then applied to the residual error of the analysis data to find departures from expected seasonality.Finally, residual z-score values for the analysis data within the specified date range is summarized and thresholded to classify change.
The harmonic regression model is fitted on a pixel-wise basis to the baseline data and then applied to both the baseline and the analysis data.Next, the residual error is computed for both the baseline and analysis data.Up to this point, the harmonic z-score method, EWMACD, and BFAST Monitor methods are largely the same.The primary difference with these three methods is how change is classified.While EWMACD and BFAST Monitor use exponentially weighted moving average (EWMA) charting and moving sum (MOSUM) of the residuals, respectively [11,12], the harmonic z-score method uses the mean and standard deviation of the targeted date period baseline residuals to compute the z-score of the targeted analysis period residuals.The date range and specific years used to define the baseline can be tailored to optimize the discernment of specific disturbances within the analysis years.
Since an analysis period may have more than a single observation, a method for summarizing these values is specified for both the basic and harmonic z-score methods to constrain the final z-score value.For this study, the mean of values was used.Change is then identified by thresholding the summary z-score value.For this project, summarized z-score values less than −0.8 were identified as change.This threshold was chosen based on analyst expertise obtained from iterative qualitative comparison of z-score results with post-disturbance image data.
The linear trend method makes use of the median of the cloud and cloud shadow-free observations available within a specified target date period.This is done for a specified number of years prior to the analysis year, referred to as an epoch.For an epoch, an ordinary least square linear regression model is fit on a pixel-wise basis as follows: where b is the band or index of image, a 0 is the intercept, a 1 is the slope, t is the date, and y is the predicted value.
Change is identified where the slope (a 1 ) is less than a specified threshold.The threshold used for this study was −0.03.This was chosen based on analyst expertise similar to the identification of the threshold used for the z-score methods.This method differs from ITRA [25] since it does not use a t-test to classify change.ORS methods differ from existing RTFD methods in several key areas.Firstly, RTFD products are created in a near-real time environment that GEE cannot currently provide.The composites that are used are therefore slightly different than those used in ORS.Secondly, all baseline statistics are computed on a pixel-wise basis for ORS methods, while they are computed on a zone-wise basis for RTFD methods-where the zones are defined by the combination of USGS mapping zone, forest type, and MODIS look angle strata [3].The most pronounced difference, however, is that all model input parameters for RTFD are fixed, while ORS parameters can be tailored to a specific forest disturbance event by an expert user.
We tested all three methods in both study areas.Analysts chose targeted date ranges based on expert knowledge of when each event was most visible.For the Southern New England study area, ORS analyses were conducted between 25 May and 9 July for both 2016 and 2017.Baseline years spanned 2011-2015 for harmonic z-score and regular z-score methods while a three-year epoch length was used for the linear trend method.For the Rio Grande National Forest study area, ORS analyses were conducted from 9 July to 15 October for 2013 and 2014.Baseline years spanned 2007-2011 for harmonic z-score and regular z-score methods while a five-year epoch length was used for the linear trend method.

Accuracy Assessment Methods
We performed an independent accuracy assessment to understand how well ORS products performed relative to existing FHP disturbance mapping programs.We followed best practices for sample design, response design, and analysis as outlined by Olofsson et al. ( 2014) [26] and Pontius and Millones (2011) [27].We drew a simple random sample across all 30 m × 30 m pixels that were within each study area's tree mask.Since all change detection for this study was performed ORS methods differ from existing RTFD methods in several key areas.Firstly, RTFD products are created in a near-real time environment that GEE cannot currently provide.The composites that are used are therefore slightly different than those used in ORS.Secondly, all baseline statistics are computed on a pixel-wise basis for ORS methods, while they are computed on a zone-wise basis for RTFD methods-where the zones are defined by the combination of USGS mapping zone, forest type, and MODIS look angle strata [3].The most pronounced difference, however, is that all model input parameters for RTFD are fixed, while ORS parameters can be tailored to a specific forest disturbance event by an expert user.
We tested all three methods in both study areas.Analysts chose targeted date ranges based on expert knowledge of when each event was most visible.For the Southern New England study area, ORS analyses were conducted between 25 May and 9 July for both 2016 and 2017.Baseline years spanned 2011-2015 for harmonic z-score and regular z-score methods while a three-year epoch length was used for the linear trend method.For the Rio Grande National Forest study area, ORS analyses were conducted from 9 July to 15 October for 2013 and 2014.Baseline years spanned 2007-2011 for harmonic z-score and regular z-score methods while a five-year epoch length was used for the linear trend method.

Accuracy Assessment Methods
We performed an independent accuracy assessment to understand how well ORS products performed relative to existing FHP disturbance mapping programs.We followed best practices for sample design, response design, and analysis as outlined by Olofsson et al. (2014) [26] and Pontius and Millones (2011) [27].We drew a simple random sample across all 30 m × 30 m pixels that were within each study area's tree mask.Since all change detection for this study was performed retrospectively, timely field reference data could not be collected.Instead, we collected independent reference data using TimeSync [13].TimeSync is a tool that enables a consistent manual inspection of the Landsat time series along with high resolution imagery found within Google Earth.Single Landsat pixel-size (30 m × 30 m) plots were analyzed throughout the baseline and analysis periods.The response design was created for the US Forest Service Landscape Change Monitoring System (LCMS) [28] and USGS Landscape Change Monitoring Assessment and Projection (LCMAP) [29] projects to provide consistent depictions of land cover, land use, and change process.Rigorous analyst training and calibration was used to overcome the subjective nature of analyzing data in this manner.We analyzed 230 plots across the Rio Grande National Forest and 416 plots across the Southern New England study area.We then cross-walked all responses for each year to change and no change.For consistency, all ORS, IDS, and RTFD outputs had the same 30 m spatial resolution tree mask used for drawing the reference sample applied to them.All MODIS-based ORS and RTFD outputs were resampled from 250 m to 30 m spatial resolution using nearest neighbor resampling.The reference data was then compared to each of the ORS outputs along with the IDS and RTFD products for each analysis year.Accuracy metrics follow suggestions by Pontius and Millones (2011) [27].They include those related to allocation disagreement (overall accuracy, and class-wise omission and commission error rates) and quantity disagreement (reference and predicted prevalence).The 5th and 95th percentile confidence interval was calculated using the overall accuracy of 500 bootstrap random samples for each assessed output.While these methods provide a depiction of the accuracy of the assessed change detection methods within the tree mask, it completely omits any areas outside of this mask from all analyses.

Results
The reference dataset produced using TimeSync for each analysis year was compared to each of the 12 ORS outputs, as well as the RTFD and IDS outputs.The 12 ORS outputs reflect the combination of basic z-score, harmonic z-score, and trend analysis methods, calculated using either NDVI or NBR greenness indices, and derived from either Landsat or MODIS data.

Southern New England
The overall accuracy of all outputs ranges from 84.86% to 92.55% in 2016 and 71.63% to 87.02% in 2017 (Tables 3 and 4).The overall accuracies are largely insignificantly different from one another with respect to the 5th and 95th percentile confidence intervals.The proportion of plots the reference data indicates has changed is sometimes similar to the map outputs while others are different in relative and absolute terms.This result can be found in the tree change prevalence columns in Tables 3  and 4. The highest overall accuracy and lowest commission, omission, and prevalence error rates are highlighted in each table.The ORS trend algorithm has the highest overall accuracy and lowest change prevalence error for both 2016 and 2017.Omission and commission error rates are quite varied and inconsistent across different combinations of algorithm, platform, and index.Many ORS outputs perform as well or better than IDS and RTFD outputs for both years.
The patterns of spatial agreement are evident in Figure 6, where a heat map of the number of ORS outputs that found a pixel as change are displayed with the corresponding IDS and RTFD (z-score) maps.Overall spatial patterns are similar between each product.There is a notable similarity between the RTFD maps and MODIS-based ORS maps.This reflects the similarities between the source data and methods used to derive these maps.

Rio Grande National Forest
The southwestern United States, including the Rio Grande National Forest study area, has been impacted for the last 20 years by prolonged drought conditions, contributing to slow-onset insect infestations including mountain pine beetle and spruce bark beetle [24,28].This study focused on the change that occurred in 2013 and 2014.
The overall accuracy of products obtained for the Rio Grande study area are lower than the Southern New England study area-ranging from 63.48% to 79.13% in 2013 and 66.09% to 76.09% in 2014 (Tables 5 and 6).Similar to the results from Southern New England, the overall accuracies are largely insignificantly different.The tree change prevalence error is a bit higher for most outputs in the Rio Grande study area: it is underestimated by as much as 26.09% (88.23% quantity error) and overestimated by as much as 22.61%.In general, the linear trend ORS method performed the best for both the 2013 and 2014 outputs.This is not surprising due to the gradual nature of tree decline and mortality caused by the bark beetle activity that has been occurring in this study area.
The heat map in Figure 7 shows different patterns of spatial agreement and disagreement than we observed in the Southern New England study area.The first notable difference is that both RTFD and ORS methods continue to capture areas that changed in both 2013 and 2014, while IDS maps only show new areas of change in 2014.This difference is inherent in the underlying methods, as IDS will often not remap previously existing long-term change.The second notable difference is that the area of new tree change in 2014 in the northeast portion of the study area was largely missed by RTFD, but was captured by both IDS and many ORS methods.

Discussion
This study improves our understanding of how well ORS algorithms perform compared to existing FHP mapping programs and provides an idea of how well these algorithms perform relative to other similar algorithms.
All methods have similar overall accuracies and highly variable omission and commission error rates.Placing this study in the context of recent forest change detection research, some similarity in results can be found.Cohen et al. (2017) [15] provided an accuracy assessment of several change algorithms, while Brooks et al. (2017) [30] examined EWMACD and a newer adaptation of EWMACD called dynamic EWMACD (Edyn).Both studies use TimeSync to gather reference data.Cohen et al. (2017) found omission and commission error rates for EWMACD (similar to ORS harmonic z-score method) and ITRA (similar to ORS trend method) are greater than 70%.This is a higher error rate than most results in this study.Brooks et al. (2017) find EWMACD has a 65.2% omission error rate and 39.9% commission error rate [30].
While the omission and commission error rates seem to generally be high when using reference data from TimeSync, Cohen et al. (2017) [15] assert that the level of disagreement between the TimeSync-based reference data and outputs from automated algorithms can occur for many reasons.They acknowledge that change detection methods are now trying to detect more subtle changes and therefore TimeSync-based response designs (such as the one used in this study) now include lower magnitude disturbances.TimeSync's strength is that it relies on utilizing more than just Landsat data, and leverages a human interpreter's ability to discern subtle changes.They recognize this leads to higher commission/omission error rates than earlier change detection accuracy assessments.This is not to say TimeSync reference data over-estimate change, but rather highlights that individual automated change detection algorithms can only attain a finite level of sensitivity to change.
While the results obtained in this study are comparable to those obtained in other change detection studies, we must also evaluate the accuracy of these map outputs relative to each other.Various accuracy measures demonstrate that ORS outputs often perform comparably or better than IDS or RTFD products.Since there are several ORS outputs that were out-performed by IDS and/or RTFD outputs, the importance of choosing the best ORS method and parameters for a given mapping task is evident.Future work to quantitatively optimize the parameters used in ORS algorithms will be conducted to address this need.
In the meantime, ORS outputs will continue to be qualitatively calibrated.This may call into question the utility of the ORS program in its current form.While the importance of quantitative calibration cannot be understated, the ORS program has been qualitatively calibrating these algorithms for three growing seasons.Our experience indicates that the harmonic z-score algorithm is very difficult to qualitatively calibrate and produces highly inconsistent results.Both the trend and regular z-score methods have consistently produced satisfactory results.These findings largely agree with the results of this study.
Despite having coarser spatial resolution, MODIS-based outputs often out-performed equivalent Landsat-based outputs.This likely is related to the differences between their temporal resolution.The limited availability of cloud/cloud shadow-free Landsat images hinders its utility to consistently detect ephemeral forest disturbances that must be captured within narrow time windows.The frequent observations from MODIS permit a much greater opportunity to detect such disturbances.
While largely unapparent in the accuracy assessment results, IDS data exhibit unique characteristics not found in RTFD or ORS outputs.Firstly, whereas the ORS and RTFD change data are derived from space-borne satellite imagery from Landsat and MODIS platforms, the IDS data are derived from observations made by state and local forestry personnel from ground surveys or an aerial vehicle (e.g., fixed-wing aircraft or helicopter).Secondly, there is a range of methods used to compile IDS data on a state-by-state basis.The inconsistent methods employed to produce IDS data on a local level are evident in the Southern New England study area, where large polygons are favored in Massachusetts, smaller polygons are more common in Connecticut, and grid cells are the primary spatial unit in Rhode Island (Figure 8).In contrast, the results obtained from forest change detection algorithms using satellite imagery do not vary from place to place, but rather maintain an objective quality over space and time.This is likely why varying levels of agreement were seen between IDS data and the ORS and RTFD spatial results within the two study areas.
Remote Sens. 2018, 10, x FOR PEER REVIEW 18 of 21 detection algorithms using satellite imagery do not vary from place to place, but rather maintain an objective quality over space and time.This is likely why varying levels of agreement were seen between IDS data and the ORS and RTFD spatial results within the two study areas.This study did not use Sentinel-2 Multispectral Instrument data, as their availability remains sparse over North America [31].These data are also freely available, regularly collected, and have similar spatial resolution and spectral response functions to those of Landsat [32].The spatial and spectral similarity of Sentinel-2 and Landsat suggests that these image data could be combined for hybrid analyses.Such hybrid approaches would increase the revisit frequency from every eight days with Landsat 7 and 8 to a range of everyday to every five days.Future work that incorporates these data to create ORS products will likely improve the quality of non-MODIS-based outputs by providing more temporally dense data to detect and track ephemeral forest disturbances.

Conclusions
This study tested three algorithms developed to map forest change for the USDA Forest Service's FHP.Utilizing GEE as a platform to create these spatial data products was efficient and effective.ORS, IDS, and RTFD forest change products exhibited similar overall accuracy, and omission and commission error rates to other studies that used TimeSync to assess forest change products.The ORS spatial data often had similar or higher accuracy than existing FHP mapping program outputs.A unique strength of ORS spatial data lies in the adaptive nature of the algorithms employed.In contrast to the standardized method employed to create data for the RTFD program, the ability to adjust model parameters based on the characteristics of specific disturbance events permits the tuning of ORS spatial products.Additionally, the flexibility to use either Landsat scale or MODIS satellite image data in ORS algorithms permits the collection of timely synoptic observations regardless of the ephemeral nature of intra-seasonal forest disturbances.
While this study highlights that assessed FHP products have high error rates, real-world use has demonstrated their utility for spatially characterizing ephemeral and long term forest disturbances.ORS data have been used to augment the 2017 annual IDS database in the Southern New England This study did not use Sentinel-2 Multispectral Instrument data, as their availability remains sparse over North America [31].These data are also freely available, regularly collected, and have similar spatial resolution and spectral response functions to those of Landsat [32].The spatial and spectral similarity of Sentinel-2 and Landsat suggests that these image data could be combined for hybrid analyses.Such hybrid approaches would increase the revisit frequency from every eight days with Landsat 7 and 8 to a range of everyday to every five days.Future work that incorporates these data to create ORS products will likely improve the quality of non-MODIS-based outputs by providing more temporally dense data to detect and track ephemeral forest disturbances.

Conclusions
This study tested three algorithms developed to map forest change for the USDA Forest Service's FHP.Utilizing GEE as a platform to create these spatial data products was efficient and effective.ORS, IDS, and RTFD forest change products exhibited similar overall accuracy, and omission and commission error rates to other studies that used TimeSync to assess forest change products.The ORS spatial data often had similar or higher accuracy than existing FHP mapping program outputs.A unique strength of ORS spatial data lies in the adaptive nature of the algorithms employed.In contrast to the standardized method employed to create data for the RTFD program, the ability to adjust model parameters based on the characteristics of specific disturbance events permits the tuning of ORS spatial products.Additionally, the flexibility to use either Landsat scale or MODIS satellite image data in ORS algorithms permits the collection of timely synoptic observations regardless of the ephemeral nature of intra-seasonal forest disturbances.
While this study highlights that assessed FHP products have high error rates, real-world use has demonstrated their utility for spatially characterizing ephemeral and long term forest disturbances.ORS data have been used to augment the 2017 annual IDS database in the Southern New England study area for disturbances that occurred after the annual ADS had concluded [33].In future years, data from ORS models will also be used to respond to local disturbance reports from counties, states, and forest units.

Figure 1 .
Figure 1.Map of study areas included in this project.

Figure 2 .
Figure 2. Graph of SWIR1 band across time.Blue dots represent observed SWIR1 values.The green line is the mean.The red line is −1 standard deviation.TDOM would identify any value below the red line as a dark outlier in the SWIR1 band.This same analysis is then performed in the NIR band to identify dark outliers.

Figure 2 .
Figure 2. Graph of SWIR1 band across time.Blue dots represent observed SWIR1 values.The green line is the mean.The red line is −1 standard deviation.TDOM would identify any value below the red line as a dark outlier in the SWIR1 band.This same analysis is then performed in the NIR band to identify dark outliers.

Figure 3 .Figure 3 .
Figure3.Illustration of the basic z-score approach.First observations are limited to a specified date period (1 July-31 August in this example).The mean and standard deviation of the baseline observations (2011-2015 in this example) are then computed.These statistics are then applied to the analysis year (2016 in this example) to find departures from the baseline.

Figure 4 .
Figure 4. Illustration of the harmonic z-score change detection approach.To capture the seasonality of a pixel, this approach uses data from the entire year.A harmonic regression model is first fit to the baseline data (2011-2015 in this example).That same harmonic model is then applied to both the baseline and analysis (2016 in this example) data.The mean and standard deviation of the residual error from the baseline observations within the specified date range are then computed.These statistics are then applied to the residual error of the analysis data to find departures from expected seasonality.Finally, residual z-score values for the analysis data within the specified date range is summarized and thresholded to classify change.

Figure 4 .
Figure 4. Illustration of the harmonic z-score change detection approach.To capture the seasonality of a pixel, this approach uses data from the entire year.A harmonic regression model is first fit to the baseline data (2011-2015 in this example).That same harmonic model is then applied to both the baseline and analysis (2016 in this example) data.The mean and standard deviation of the residual error from the baseline observations within the specified date range are then computed.These statistics are then applied to the residual error of the analysis data to find departures from expected seasonality.Finally, residual z-score values for the analysis data within the specified date range is summarized and thresholded to classify change.

21 Figure 5 .
Figure 5. Illustration of the trend change detection algorithm.First observations are limited to the median of a specified date period (1 July-31 August in this example).Then, for a given epoch (2012-2016 in this example), an ordinary least squares linear regression model is fit to the annual median pixel value.The slope of this line is then thresholded to find change areas.

Figure 5 .
Figure 5. Illustration of the trend change detection algorithm.First observations are limited to the median of a specified date period (1 July-31 August in this example).Then, for a given epoch (2012-2016 in this example), an ordinary least squares linear regression model is fit to the annual median pixel value.The slope of this line is then thresholded to find change areas.

Figure 6 .
Figure 6.Map panels showing the extent of gypsy moth defoliation within the Southern New England study area in 2016 and 2017 as depicted by the RTFD analytical output, the IDS database, and ORS MODIS and Landsat analyses.ORS outputs are presented as heat maps of the number of outputs that found the area as change (1-6).

Figure 6 .
Figure 6.Map panels showing the extent of gypsy moth defoliation within the Southern New England study area in 2016 and 2017 as depicted by the RTFD analytical output, the IDS database, and ORS MODIS and Landsat analyses.ORS outputs are presented as heat maps of the number of outputs that found the area as change (1-6).

Figure 7 .
Figure 7. Map panels showing the extent of mountain pine beetle and spruce budworm damage within the Rio Grande National Forest study area in Colorado in 2013 and 2014 as depicted by the RTFD analytical output, the IDS database, and ORS MODIS and Landsat analyses.ORS outputs are presented as heat maps of the number of outputs that found the area as change (1-6).

Figure 7 .
Figure 7. Map panels showing the extent of mountain pine beetle and spruce budworm damage within the Rio Grande National Forest study area in Colorado in 2013 and 2014 as depicted by the RTFD analytical output, the IDS database, and ORS MODIS and Landsat analyses.ORS outputs are presented as heat maps of the number of outputs that found the area as change (1-6).

Figure 8 .
Figure 8. Map of the 2017 IDS spatial data used for the Southern New England study area depicting three different methods of IDS delineation-large polygons in MA, small polygons in CT, and grid cells in RI.

Figure 8 .
Figure 8. Map of the 2017 IDS spatial data used for the Southern New England study area depicting three different methods of IDS delineation-large polygons in MA, small polygons in CT, and grid cells in RI.

Table 1 .
Descriptions of the two study areas used in this work.

Table 2 .
The MODIS image data collections used in this study, as well as the spatial resolutions of these collections.Additionally, the bands used in this study are listed.The bandwidth of the SWIR1 band is 1230-1250 nm, and the bandwidth of the SWIR2 band is 1628-1652 nm.

Table 3 .
Southern New England overall accuracy with 5% and 95% confidence intervals, omission/commission error rates, and prevalence error for each ORS method tested as well as existing FHP program outputs (IDS and RTFD) for 2016.The highest overall accuracy and lowest commission, omission, and prevalence error rates are highlighted in bold.

Table 4 .
Southern New England overall accuracy with 5% and 95% confidence intervals, omission/commission error rates, and prevalence error for each ORS method tested as well as existing FHP program outputs (IDS and RTFD) for 2017.The highest overall accuracy and lowest commission, omission, and prevalence error rates are highlighted in bold.