Astrape: A System for Mapping Severe Abiotic Forest Disturbances Using High Spatial Resolution Satellite Imagery and Unsupervised Classification

Abstract: Severe forest disturbance events are becoming more common due to climate change and many forest managers rely heavily upon airborne surveys to map damage. However, when the damage is extensive, airborne assets are in high demand and it can take managers several weeks to account for the damage, delaying important management actions. While some satellite-based systems exist to help with this process, their spatial resolution or latency can be too large for the needs of managers, as evidenced by the continued use of airborne imaging. Here, we present a new, operational-focused system capable of leveraging high spatial and temporal resolution Sentinel-2 and Planet Dove imagery to support the mapping process. This system, which we have named Astrape (“ah-STRAH-pee”), uses recently developed techniques in image segmentation and machine learning to produce maps of damage in different forest types and regions without requiring ground data, greatly reducing the need for potentially dangerous airborne surveys and ground sampling needed to accurately quantify severe damage. Although some limited field work is required to verify results, similar to current operational systems, Astrape-produced maps achieved 78–86% accuracy with respect to damage severity when evaluated against reference data. We present the Astrape framework and demonstrate its flexibility and potential with four case studies depicting four different disturbance types (fire, hurricane, derecho and tornado) in three disparate regions of the United States. Astrape is capable of leveraging various sources of satellite imagery and offers an efficient, flexible and economical option for mapping severe damage in forests.


Introduction
The frequency and severity of storms and wildfires have increased in the past decade and there is evidence linking these increases to climate change [1][2][3][4][5][6]. This has direct implications for forest management because the risk of forest timber loss also increases as storm frequency and/or severity increases. Aerial surveys are frequently used to map forest damage resulting from severe disturbance events in the United States [7]. While effective, these flights can be dangerous, costly and time consuming when affected areas are thousands of hectares in size and widely dispersed. In recent years, satellite imagery has been used for mapping purposes; however, the spatial and/or temporal resolution of these systems may not always be sufficient for the needs of forest managers.
A large number of satellite systems are already operational in many parts of the U.S. (Table 1). These remote sensing approaches have greatly expanded the tools available to incident response, but there remains a need for a flexible, region- and disturbance-agnostic damage mapping system that is able to leverage high spatial resolution imagery (defined here as pixel size <10 m) for use by forest managers.
Table 1.
Product: Heat map of NDVI change at 250 m spatial resolution. Notes: High temporal resolution, able to help direct further initial response efforts. Not optimized for fine-spatial-scale operational mapping.
System: HiForm [9]. Imagery: Landsat, Sentinel-2. Method: Image compositing to create a pre- and post-event image. Calculates a continuous scale of departure from average NDVI values (GEE). Product: Heat map of NDVI change at 10 or 30 m spatial resolution, depending on image source. Notes: Potential to inform initial response efforts, but may exhibit long lag time due to clouds/image collection necessary for compositing.
System: ORS [10]. Imagery: MODIS, Landsat. Method: Time-series analysis using three methods: basic z-score, harmonic z-score and linear trend (GEE; may be produced upon request*). Product: Heat map of change at 30 m spatial resolution. Notes: Ability to adjust model parameters based on characteristics of specific disturbance events, permitting the tuning of ORS products. Able to provide intra-seasonal results, but may exhibit long lag time and/or coarse spatial resolution. Requires in-depth knowledge of methods and parameters.
System: LandTrendr [11]. Imagery: Landsat. Method: Analyzes the spectral response over a time series, smoothing anomalies unless the sudden departure from the trend line is sustained (GEE). Product: Maps of sudden disturbances and subsequent recovery at 30 m spatial resolution. Notes: Optimized for monitoring recovery after a sudden disturbance event. Not intended as a tool for initial response efforts.
System: Global Forest Change [12]. Imagery: Landsat. Method: Time-series analysis to identify stand-replacing disturbances and reforestation on a global level (Web app; produced internally). Product: Data with locations of stand-replacing loss and canopy gain at 30 m spatial resolution. Notes: Global coverage of tree canopy gain and total loss. Does not report less severe damage. Lag time over a year. Not intended as a tool for initial response efforts.
Notes: Intended to inform initial management and response efforts. Only calibrated for wildfires in the western regions of the USA.
Acronyms: MODIS-Moderate Resolution Imaging Spectroradiometer; ORS-operational remote sensing; RAVG-rapid assessment of vegetation condition after wildfire; BARC-burned area reflectance classification; GTAC-Geospatial Technology and Applications Center; NDVI-normalized difference vegetation index; NBR-normalized burn ratio; RdNBR-relative difference NBR; GEE-Google Earth Engine; USFS-United States Forest Service.
Although most algorithms in Table 1 exploit Landsat data, there is great disparity in the efficacy of different approaches. Cohen et al. [15] compared the performance of seven algorithms on disturbance events using multi-year image stacks of Landsat data and found that all seven algorithms agreed on only 3% of the disturbances within a given year, indicating significant differences across methods. This was not surprising given the different disturbance types and locations on which these algorithms were developed, but it does suggest that the choice of algorithm is not straightforward and error rates during operational use may be higher than generally assumed.
In this application, temporal resolution is also an important consideration. Once damaged, trees become susceptible to insect infestation and fungal decay, factors that if allowed to progress will negatively impact the sale value of the timber. Responding foresters need disturbance extent and severity information as soon as possible to know where salvage efforts should be prioritized. A higher revisit rate offers more opportunities to capture cloud-free imagery when conditions are good. Two of the most promising high-spatial resolution satellite sources for forest disturbance mapping are Sentinel-2 (revisit time 2-10 days, based upon latitude) and Planet's Dove satellites (near-daily revisit time) [16,17].
Sentinel-2 is a constellation of two multispectral satellites developed by the European Space Agency (ESA) with 13 bands that include three visible (RGB) and one near-infrared (NIR) band at 10 m spatial resolution, four additional NIR and red edge bands and two short-wave infrared (SWIR) bands at 20 m, and an aerosol band and 2 SWIR bands at 60 m [18]. Planet's extensive Dove constellation of >100 CubeSats is limited to three RGB bands and a NIR band, provided as 3 m imagery products. The recent development of HiForm (Table 1) has begun to fill the gap for systems that use high-spatial resolution imagery and can be applied to multiple regions, but it is limited to Sentinel-2 imagery. Dalponte et al. [19] leveraged both Dove and Sentinel-2 imagery to successfully map wind damage in a mountainous region of Italy and found similar results with both sources, but the approach required significant user input and was not tested across different disturbance types or regions.
Aside from Sentinel-2 and Dove imagery, other high-spatial resolution sensors either have a long revisit time or are task-based (i.e., their collection is not ubiquitous). As a consequence, studies using sources of high-spatial resolution imagery to map severe damage in forests are uncommon and tend to focus on a single region or event [19][20][21][22]. More common are studies that use both satellite and airborne imagery to answer specific questions about disturbances caused by pest or pathogen spread, or impacts such as drought [23][24][25][26]. Often, these studies also leverage site-specific ancillary data. For example, Meng et al. [27] successfully used two dates of Worldview-2 imagery to map burn severity in a 432 ha area, but required significant preprocessing and fusion with airborne data from NASA's Goddard LiDAR, Hyperspectral and Thermal (G-LiHT) platform. The combination of task-based image acquisition and the need for site-specific ancillary data makes these sources of imagery unlikely to be used in a general operational sense in the near term.
Here, we outline a general operational-focused system we have named Astrape ("a-STRAH-pee") that exploits high-spatial resolution imagery with a new approach to map various severe disturbance events across different regions and forest types (in Greek mythology, Astrape is a goddess associated with lightning and storms). In developing Astrape, we assumed prior research has sufficiently demonstrated the efficacy of using Sentinel-2 and Dove imagery to map severe damage [14,19,21,22] and instead focus on assimilating that knowledge with new techniques. We include seven key characteristics in our approach:
Automated classification using machine learning (here, we tie together Jenks Natural Breaks and Extreme Gradient Boosting);
Use of widely available ancillary data for topographical correction and masking (here, we leverage data available for the United States, described in Section 2.1.2, but country-specific or global datasets could be used);
Built-in flexibility to account for different area of interest (AOI) sizes in different regions and forest types;
Ability to map multiple severe disturbance agents;
Use of only two images (pre- and post-event) instead of time-series data to minimize production time.
To the best of our knowledge, no study has attempted to build such a system with all of these constraints. Notably, we did not presume to offer exact estimates of damage as this would be difficult to determine across different forest and disturbance types given their disparate reflectance characteristics. Similar to HiForm and ORS, Astrape-produced maps would still require a limited amount of ground-verification, but this dependence on field sampling would be greatly reduced and, by extension, the time, costs and risks associated with it would also be mitigated. With this in mind, we aimed to enable quick management decisions and allocation of limited resources.

Materials and Methods
Astrape consists of two modules: 1) an Image Segmentation and Differencing module and 2) an Automated Classification module (Figure 1). Astrape was developed such that user input is minimal and requires only knowledge of how to obtain quality imagery, stack it, conduct initial masking (clouds, bad pixels, etc.) and perform any radiometric correction (often needed for Dove imagery mosaics). All other preprocessing, segmentation and classification methods described below are built into Astrape and performed automatically. All Sentinel-2 imagery and recommended ancillary data (described below) are freely available and easy to access. Dove imagery, however, is proprietary and access will depend on the user's organizational access. All Sentinel-2 and Dove imagery products downloaded were already corrected to surface reflectance (Level 2A and 3B, respectively). Imagery bands used in this study are provided in Table 2 and were chosen to aid in the segmentation process and facilitate the calculation of the vegetation indices described in Section 2.1.2. Astrape is written in Python using open-source libraries that include, most notably, jenkspy, rsgislib, sklearn and xgboost.

Module Overview
Astrape is a geographic object-based image analysis (GEOBIA) system. GEOBIA focuses on using objects on the landscape rather than pixels to extract valuable information [28,29]. Often used with high-spatial resolution imagery, GEOBIA can reduce noise (speckle) and improve classification results, especially when the parameters for the segmentation algorithm are optimized [30][31][32][33][34][35][36][37][38][39][40][41][42]. Here, we use GEOBIA to locate areas with varying degrees of change (i.e., damage). This differs from many GEOBIA-based studies in the literature that attempt to identify discrete entities like buildings, individual trees, or contiguous forest stands of common composition. By using GEOBIA in this way, we aim to produce discrete boundaries of damage, thereby giving foresters a quick way to mark areas for harvest or other management options. Having reasonable boundaries already created for them reduces the time and cost of delineating these boundaries in the field or in the lab from a post-event image. This, in turn, allows the damaged timber to be salvaged sooner, improving the returns on the salvage.

Preprocessing and Ancillary Data
The image segmentation and differencing module begins with pre-processing steps that include cropping the imagery to the area of interest (AOI), topographic correction, calculation of vegetation indices (VIs) and image masking. Topographic correction is performed on the original images following Soenen et al. [43] using tiles from the seamless 1/3 arc-second digital elevation model (DEM, freely available at https://viewer.nationalmap.gov, accessed on 15 January 2021), which covers the entire United States. Astrape automatically adjusts the coordinate reference system and spatial resolution of the DEM as necessary to conform to the imagery and applies the topographic correction methodology.
Instead of thresholding a single VI, such as the normalized difference vegetation index (NDVI) [18], we leverage multiple VIs to offset weaknesses, such as saturation [44,45], that may occur within any single index. The values of the selected VIs are used in an unsupervised machine learning classification procedure that ties together Jenks Natural Breaks (JNB) and extreme gradient boosting (XGBoost), as described in Section 2.2. For Dove imagery, the simple ratio vegetation index (SR), the green normalized difference vegetation index (GNDVI) and the soil-adjusted vegetation index (SAVI) are calculated. The same indices are calculated for Sentinel-2 imagery, but with the availability of the SWIR band, the normalized burn ratio (NBR) and relativized difference in normalized burn ratio (RdNBR) are also calculated. These VIs were chosen because of good performance with highly vegetated areas in both Sentinel-2 and Dove imagery [13,44,46-48]. The VIs are then stacked with the original image bands.
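As an illustration of the indices named above, the following minimal sketch computes SR, GNDVI, SAVI and NBR for a single pixel using their commonly published formulas. The function names, the band keys and the SAVI soil factor of 0.5 are assumptions made for this example (the source does not state them), and reflectance is assumed to be scaled to 0-1.

```python
def simple_ratio(nir, red):
    """SR = NIR / Red."""
    return nir / red


def gndvi(nir, green):
    """GNDVI = (NIR - Green) / (NIR + Green)."""
    return (nir - green) / (nir + green)


def savi(nir, red, soil_factor=0.5):
    """SAVI = (1 + L) * (NIR - Red) / (NIR + Red + L); L = 0.5 is an assumed default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def nbr(nir, swir):
    """NBR = (NIR - SWIR) / (NIR + SWIR)."""
    return (nir - swir) / (nir + swir)


def vegetation_indices(bands):
    """Return the VIs available for one pixel.

    `bands` is a hypothetical dict of surface reflectance values (0-1) keyed by
    'green', 'red', 'nir' and, for Sentinel-2 only, 'swir'.
    """
    vis = {
        "SR": simple_ratio(bands["nir"], bands["red"]),
        "GNDVI": gndvi(bands["nir"], bands["green"]),
        "SAVI": savi(bands["nir"], bands["red"]),
    }
    if "swir" in bands:  # NBR needs a SWIR band, which Dove imagery lacks
        vis["NBR"] = nbr(bands["nir"], bands["swir"])
    return vis


dove_pixel = {"green": 0.08, "red": 0.05, "nir": 0.40}
sentinel_pixel = {"green": 0.08, "red": 0.05, "nir": 0.40, "swir": 0.10}
dove_vis = vegetation_indices(dove_pixel)
sentinel_vis = vegetation_indices(sentinel_pixel)
```

RdNBR is left out here because it depends on the pre- to post-event difference in NBR, which is only available after differencing.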
Image masking is used in Astrape to minimize misclassification and exclude nonforested areas. Images are already assumed to have been masked for clouds and bad pixels. The system uses the Tree Canopy Cover dataset created by the USFS, which covers the contiguous United States, coastal Alaska, Hawaii and Puerto Rico and is freely available from the Multi-Resolution Land Characteristics Consortium website (https://www.mrlc.gov/data/type/tree-canopy, accessed on 15 January 2021). This dataset has been shown to have a high level of accuracy for mapping forest [49,50]. By default, all pixels with canopy cover less than 50% are masked, a threshold that was found to produce good results during development. Astrape also allows for additional shapefiles (e.g., roads) to be used for masking as desired.
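The default canopy-cover rule can be sketched in a few lines. This is only an illustration on small nested lists; the function names and the use of `None` as the masked value are assumptions, and a real workflow would operate on raster arrays aligned to the imagery.

```python
def forest_mask(canopy_cover, min_cover=50):
    """Return True for pixels to keep (canopy cover of at least `min_cover` percent)."""
    return [[cover >= min_cover for cover in row] for row in canopy_cover]


def apply_mask(band, mask, nodata=None):
    """Replace masked-out pixels in `band` with `nodata` (an assumed placeholder value)."""
    return [
        [value if keep else nodata for value, keep in zip(band_row, mask_row)]
        for band_row, mask_row in zip(band, mask)
    ]


canopy = [[10, 50], [80, 49]]   # percent tree canopy cover per pixel (toy values)
nir_band = [[1, 2], [3, 4]]     # toy band values on the same grid
mask = forest_mask(canopy)
masked_nir = apply_mask(nir_band, mask)
```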

Image Segmentation
Following preprocessing, the images are segmented using all bands and VIs with the open-source image segmentation algorithm, Shepherd Segmentation, from the RSGISLib package [51,52]. Shepherd Segmentation uses the k-means algorithm and is comparable to the popular but proprietary segmentation algorithm used in eCognition software [31]. Importantly, this software uses the "KEA" raster format, which allows for a feature called the "raster attribute table" (RAT) that speeds up processing time when obtaining object metrics (e.g., mean, standard deviation) compared to other methods [52,53]. Leveraging the RAT feature allows Astrape to be run on a standard desktop computer rather than requiring the extensive computational power of a platform such as Google Earth Engine.
A critical component of the Astrape system is being able to optimize a key parameter in Shepherd Segmentation that adjusts the minimum number of pixels per segment, allowing it to conform to different textures, shapes and reflectances within any given image. We automate this parameter for each AOI by adapting the parameter-selection method outlined in Georganos et al. [36]. This method takes into account the variability within and among objects and the rate of change of these metrics to identify the optimal parameter setting. The testing range is set to 5-30 minimum pixels-per-segment, which has proven sufficient for providing enough data for the optimization procedure to perform effectively. The processing requirements for each parameter testing iteration can be heavy depending on the image and the size of the AOI, so we also adopt Shepherd et al.'s [37] approach, whereby a subset containing only 10% of the pixels in Dove imagery, or 1% of pixels for Sentinel-2 imagery, is used to find the optimal segmentation parameter for the whole input dataset. Once determined, the parameter value is given to the algorithm to segment the image over the entire AOI.
It is important to note that Astrape only segments the post-event image and the resulting object boundaries are then applied to both the pre-and post-event images. We prioritize the post-event image because we aim to create objects that represent areas with damage, and this is only apparent in the post-event image. Subsequent steps (described below) then provide information about the degree of change within each object, which in turn is linked to the severity of damage.

Image Differencing
In this application, the classification is based upon the change in VI values of the objects between image dates. We use the object mean to produce a single value per VI, per object. Since the same object boundaries are used for both the pre-and post-event images, we then subtract the object means (note that RdNBR must be calculated after the difference in NBR is ascertained). The image objects are then written to a shapefile and the attribute table is populated with the differenced values for each VI (and RdNBR, as applicable). This shapefile is then passed to the second module for classification.
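The differencing step can be illustrated with a small sketch. It assumes flat sequences of per-pixel object IDs and VI values, and it assumes the difference is taken as pre-event minus post-event (the source does not state the direction). RdNBR is computed from the differenced NBR using the commonly published form dNBR / sqrt(|pre-event NBR|) for unscaled NBR. All function names are hypothetical.

```python
import math
from collections import defaultdict


def object_means(object_ids, values):
    """Mean of a per-pixel VI within each object (both inputs are flat sequences)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for oid, value in zip(object_ids, values):
        totals[oid] += value
        counts[oid] += 1
    return {oid: totals[oid] / counts[oid] for oid in totals}


def difference_objects(object_ids, pre_values, post_values):
    """Pre-event minus post-event object mean, using the same object boundaries."""
    pre_mean = object_means(object_ids, pre_values)
    post_mean = object_means(object_ids, post_values)
    return {oid: pre_mean[oid] - post_mean[oid] for oid in pre_mean}


def rdnbr(pre_nbr_mean, dnbr):
    """Relativized dNBR; returns None when the pre-event NBR is zero."""
    if pre_nbr_mean == 0:
        return None
    return dnbr / math.sqrt(abs(pre_nbr_mean))


# Toy example: two objects of two pixels each
ids = [1, 1, 2, 2]
nbr_pre = [0.6, 0.8, 0.5, 0.5]
nbr_post = [0.2, 0.2, 0.5, 0.5]

dnbr_by_object = difference_objects(ids, nbr_pre, nbr_post)
pre_means = object_means(ids, nbr_pre)
rdnbr_by_object = {oid: rdnbr(pre_means[oid], dnbr_by_object[oid]) for oid in dnbr_by_object}
```

In this toy case object 1 shows a large drop in NBR while object 2 is unchanged, which is the kind of contrast the later classification relies on.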

Module Overview
Automating the classification process without using ground data requires an unsupervised classification method. Jenks Natural Breaks (JNB) was chosen because of its ability to produce a small number of classes and, critically, to provide these classes in a predictable order. JNB creates classes that are typically ordered by the amount of change in VIs, from the largest to the smallest difference. We then make the assumption that large differences in the object mean values equate to greater damage.
However, JNB only works on a single array, such as a single VI, which is a limiting factor when working with a dataset that contains multiple VIs. Regardless, we hypothesized that JNB could produce an initial set of classes and the information about those classes could be provided to a more sophisticated, but supervised, machine learning algorithm to improve the final results. In other words, we enlist JNB to perform most of the early "heavy lifting" to find and classify the damage and leverage machine learning (here, XGBoost) to refine and improve the results by utilizing all of the available information within the multi-variable (i.e., multiple VIs) dataset.

Jenks Natural Breaks
JNB was originally developed for analyzing geographic data within GIS software (e.g., ESRI ArcGIS, QGIS) [54,55]. JNB defines breaks in an ordered array such that the variance within each class is minimized, but variance among classes is maximized. To create a single array for JNB, we scaled all the VI values to a comparable magnitude and then averaged them together. We did this instead of using a single VI array to avoid overfitting by XGBoost.
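A minimal sketch of building the single averaged array follows. The source says only that the VIs were scaled to a comparable magnitude; min-max scaling to 0-1 is an assumption made here for illustration, as are the function names.

```python
def min_max_scale(values):
    """Rescale a list of values to the range 0-1 (assumed scaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def averaged_vi(vi_table):
    """Average the scaled VIs per object.

    `vi_table` maps a VI name to its list of differenced object values,
    with every list in the same object order.
    """
    scaled_columns = [min_max_scale(column) for column in vi_table.values()]
    return [sum(per_object) / len(per_object) for per_object in zip(*scaled_columns)]


differenced = {
    "SR": [0.0, 2.0, 4.0],
    "GNDVI": [0.1, 0.2, 0.3],
}
avg = averaged_vi(differenced)
```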
To determine the optimal number of JNB classes for this averaged-VI array, another critical component of Astrape, we use a "goodness of variance fit" (GVF) measure applied to a stratified subset of the objects (100,000 objects are used per AOI, which represents 10% or greater of the total objects). GVF is calculated as:
$$\mathrm{GVF} = \frac{\mathrm{SDAM} - \mathrm{SDCM}}{\mathrm{SDAM}}$$
where SDAM is the sum of squared deviations from the array mean (i.e., the total variation in the array) and SDCM is the sum of squared deviations from the class means (i.e., the variation remaining within classes) [56]. A higher GVF indicates breaks such that the variance of the objects within each class is low, but the variance among classes is high, similar to the way in which the optimal image segmentation parameter above is calculated. We set the threshold for GVF to 0.95 (setting GVF to the maximum, 1, would result in as many classes as unique data points). To meet this threshold, Astrape runs several iterations calculating break points, starting with 4 classes and incrementally adding a class, until the number of classes produced returns a sufficiently high GVF (here, 0.95 or higher). Astrape is set to produce no more than 15 classes, regardless of the GVF, but the highest number in all of our various prototyping tests (not all described here) was 11, supporting the choice of 0.95 for the GVF limit.
Once the optimal number of break points is determined, the entire data set is classified, producing an initial set of classes. Each object is annotated with its assigned class.
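The class-count search can be sketched as follows, assuming the usual form GVF = (SDAM - SDCM) / SDAM. The source lists jenkspy among Astrape's libraries; to keep this example self-contained it instead uses a small dynamic-programming routine (hypothetical name `natural_breaks`) that minimises within-class squared deviations for 1-D data. That is the same objective, but this version is far too slow for 100,000 objects and is meant only to show the logic.

```python
from bisect import bisect_left


def natural_breaks(sorted_vals, n_classes):
    """Optimal 1-D partition into contiguous classes.

    Returns (upper bound of each class, total within-class squared deviation).
    """
    n = len(sorted_vals)
    s1, s2 = [0.0], [0.0]  # prefix sums of x and x**2
    for v in sorted_vals:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):  # squared deviations of sorted_vals[i:j] about its own mean
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    cut = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j], cut[k][j] = c, i
    uppers, j = [], n
    for k in range(n_classes, 0, -1):  # walk back through the chosen cut points
        uppers.append(sorted_vals[j - 1])
        j = cut[k][j]
    return uppers[::-1], cost[n_classes][n]


def choose_class_count(values, start=4, max_classes=15, gvf_threshold=0.95):
    """Add classes one at a time until GVF reaches the threshold or the cap."""
    x = sorted(values)
    mean = sum(x) / len(x)
    sdam = sum((v - mean) ** 2 for v in x)
    for k in range(start, max_classes + 1):
        uppers, sdcm = natural_breaks(x, k)
        gvf = (sdam - sdcm) / sdam
        if gvf >= gvf_threshold:
            break
    return k, uppers, gvf


def classify(value, uppers):
    """Class index (0 = smallest values); each class includes its upper bound."""
    return min(bisect_left(uppers, value), len(uppers) - 1)


# Toy averaged-VI values forming four obvious groups
toy = [0.0, 0.01, 0.02, 1.0, 1.01, 1.02, 2.0, 2.01, 2.02, 3.0, 3.01, 3.02]
n_classes, class_uppers, gvf_value = choose_class_count(toy)
toy_labels = [classify(v, class_uppers) for v in toy]
```

The defaults mirror the values stated in the text (start at 4 classes, stop at GVF of 0.95 or higher, cap at 15 classes); the input must contain at least as many values as the number of classes being tested.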

XGBoost
The initial JNB-produced classes are supplied to XGBoost to further refine the classes using all of the information available in the separate VIs. Extreme gradient boosting (XGBoost) was used because of its ability to quickly and efficiently create cohesive and consistent classes [57]. XGBoost is a fast, scalable implementation of gradient boosted regression trees (GBRT), an ensemble machine learning classification technique that performs well with large, multi-variable datasets. As with the more familiar random forest approach, XGBoost uses decision trees at its baseline but builds trees in succession instead of independently. Each new tree built is a result of what has been learned from the errors in classification of the previous trees, thus producing better results with each successive tree [58,59]. XGBoost differs from normal gradient boosting trees in its ability to take a macro view of the features in the data and minimize the computation time needed to weigh their effectiveness in producing accurate classifications [57].
XGBoost requires selection of multiple parameters to produce the best possible model. In our testing, most of the default settings in the Python xgboost package yielded acceptable model performance within Astrape. The exceptions were as follows: we set the maximum depth of each tree to one less than the number of VIs, increased the minimum child weight to 10 to be more conservative and reduced the subsample ratio to 0.5 to limit overfitting.
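The three non-default settings described above can be written down as a small configuration helper. The function name is hypothetical; the keys are the standard xgboost parameter names `max_depth`, `min_child_weight` and `subsample`, and everything else is assumed to stay at the package defaults.

```python
def astrape_like_xgb_params(n_vis):
    """Non-default XGBoost settings described in the text, for `n_vis` vegetation indices."""
    return {
        "max_depth": n_vis - 1,   # one less than the number of VIs
        "min_child_weight": 10,   # more conservative splits
        "subsample": 0.5,         # row subsampling to limit overfitting
    }


dove_params = astrape_like_xgb_params(3)       # SR, GNDVI, SAVI
sentinel_params = astrape_like_xgb_params(5)   # plus NBR and RdNBR
```

A dictionary like this could be unpacked as keyword arguments when constructing an `xgboost.XGBClassifier`, although how Astrape actually passes its settings is not stated in the source.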
The paradox of using XGBoost in our semi-supervised approach is that we did not want perfect accuracy between the testing and training data. This would imply it had simply replicated the results produced by JNB instead of choosing the best class based upon all of the information available across all VIs. With this in mind, we averaged the VIs for the JNB portion instead of using a single VI (Section 2.2.2) and then restored the separate VI values for each object before using XGBoost. We used an averaged VI array because we found that using a single VI in the JNB process resulted in XGBoost simply "discovering" this fact and weighting that VI so heavily that it excluded the valuable information in the other VIs. In addition, we selected a relatively centralized subset of objects from each JNB class (only those objects within 1 standard deviation of the class mean) to develop the XGBoost model. This purposefully excluded objects near the edges of JNB-produced class thresholds. We did this because we hypothesized that when the XGBoost model was applied to all image objects, it would choose the class it determined to be the best fit for each object by leveraging all of the VI data, thus improving the consistency within the classes. We address this further in the discussion section.
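The selection of a centralized training subset can be sketched as below. It assumes the distance from the class mean is measured on the averaged-VI value and uses the population standard deviation; the source does not specify either detail, and the function and variable names are hypothetical.

```python
from statistics import mean, pstdev


def core_training_ids(avg_vi, jnb_class):
    """IDs of objects within 1 standard deviation of their JNB class mean.

    `avg_vi` maps object ID to averaged-VI value; `jnb_class` maps object ID
    to the class assigned by JNB.
    """
    ids_by_class = {}
    for oid, label in jnb_class.items():
        ids_by_class.setdefault(label, []).append(oid)

    keep = []
    for ids in ids_by_class.values():
        values = [avg_vi[oid] for oid in ids]
        centre, spread = mean(values), pstdev(values)
        keep.extend(oid for oid in ids if abs(avg_vi[oid] - centre) <= spread)
    return sorted(keep)


avg_values = {1: 0.0, 2: 0.5, 3: 0.5, 4: 1.0, 5: 2.0, 6: 2.0}
jnb_labels = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1}
training_ids = core_training_ids(avg_values, jnb_labels)
```

Objects 1 and 4 sit at the edges of class 0 and are dropped from training; they would still receive a predicted class when the fitted model is applied to all objects.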
The final step in Astrape is to use the XGBoost model to predict a class for each object. The resulting classes are then written as an attribute for each object and the map is saved as a shapefile. Because the classes are produced in a predictable order, the user can be assured that the top class determined by the system represents the most change (in this case, damage).

Introduction to the Case Studies
Here, we provide four case studies of Astrape mapping damage resulting from different disturbance events: hurricane, wildfire, derecho and tornado (Figure 2, Table 3). In the first three studies, we show damage impacting national forest lands in different regions of the United States. The last case study depicts damage sustained across federal, state, county and private lands in Wisconsin and Michigan. Although we use different combinations of Sentinel-2 and Dove imagery, as well as different ancillary data across case studies, the Astrape framework does not change.
Because we are mapping severe, large-scale disturbance events with short latency, reference data can be difficult to obtain and reliability in a rapid deployment setting will be anecdotal [14,20,35,58]. In two of our cases, some ground-collected reference data were available, but collecting new ground data was both arduous and dangerous. In some cases, very limited aerial reference data was available, but geo-positional alignment was questionable.
To balance the need for accuracy assessment with the safety of personnel, we opted to leverage reference data that were already available. For the majority of the studies, we had ground point data, which we compared to the same locations in the Astrape damage maps directly. For Case Study 2, we leveraged a full map of burned area (RAVG map, available at https://fsapps.nwcg.gov/ravg accessed on 3 December 2020), using the Maxwell and Warner approach to assess accuracy at the object level. The details of each reference set are described below, but all align with the traditional USFS canopy damage severity ratings: 0-25%, 25-50%, 50-75% and 75-100%. To avoid confusion with the separate map classes described later, we refer to these severity ratings as Categories A (0-25%), B (25-50%), C (50-75%) and D (75-100%) (Figure 3). In the reference data for Case Studies 2 and 3, a fifth category described as "No change," was also included, but we grouped this with Category A during assessment. We acknowledge that while these traditional categories do not offer an ideal comparison given Astrape's flexibility in setting class boundaries, they nonetheless allow us to conduct a baseline accuracy assessment that serves to demonstrate Astrape's ability to locate and map the worst of the damage without exposing field personnel to unsafe work environments.
In collaboration with numerous forestry professionals, we learned that the lower two categories (A and B) are not a high priority when foresters are dealing with severe disturbance events; they are most interested in knowing where the worst of the damage is. We therefore focus our assessment of Astrape's performance on the top damage categories, categories C and D, when comparing our results to the reference data.
In collaboration with numerous forestry professionals, we learned that the lower two categories (A and B) are not a high priority when foresters are dealing with severe disturbance events; they are most interested in knowing where the worst of the damage is. We therefore focus our assessment of Astrape's performance on the top damage categories, categories C and D, when comparing our results to the reference data.

(63,000 acres) of forest damaged in the Chequamegon-Nicolet National Forest [60,61]. Hurricane-force winds caused severe windthrow, completely flattening entire sections of mature conifer and northern hardwood forests [62]. Our AOI (44,020 ha) for this case study encompasses the western portion of the Lakewood-Laona Ranger District, which suffered some of the worst and most extensive damage. The widespread damage placed high demand on reconnaissance platforms to aid response efforts and it took months before the damage in this part of the Chequamegon-Nicolet could be fully mapped. It was this catastrophic event and the difficulty our collaborators experienced in the aftermath that directly led to the development of Astrape.

We used Dove imagery in this Case Study (note: no clear Sentinel-2 data were available following the storm) (Table 4). Images were masked using Planet's usable data mask (UDM2) and mosaics (same sensor and cross-sensor) were created using the LOESS Radiometric Correction for Contiguous Scenes (LORACCS) [63]. A final radiometric correction was done by normalizing the pre- and post-event mosaics to Harmonized Landsat-Sentinel-2 images following Leach et al. [64,65]. Astrape configurations included using the Tree Canopy Cover database for masking non-forested pixels and USGS DEM products for topographic correction (Section 2.1.2). Ground data were collected after the event, but only from roads after debris had been cleared. A total of 477 objects were assessed according to the traditional damage categories, A-D (Figure 3). We overlaid the ground data with Astrape's objects, assigning the top class to Category D (class 7), the second highest as Category C (class 6), third highest as Category B (class 5) and the remainder into Category A (classes 1-4) (Figure 4a legend).

(Table 1) using MODIS imagery. While the damage pattern is similar in both maps, there are more spatial details and clearer boundaries in the Astrape map.

Results
The Astrape-produced map (Figure 4a) shows a clear pattern of impact from the derecho, with areas of damage that are in good agreement with those in the ForWarn map (Figure 4b). Overall accuracy compared to the ground reference data was 78%, with good agreement in most categories (Table 5). Errors were generally more pronounced in Categories B and C, but Categories A and D (low and high damage levels, respectively) had both producer's and user's accuracies in the 81-93% range.
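For readers less familiar with these measures, the sketch below shows how overall, user's and producer's accuracies are derived from a confusion matrix. The counts are hypothetical values chosen for illustration only and are not those in Table 5; the function name `accuracy_metrics` and the row/column orientation (rows as mapped class, columns as reference class) are likewise our assumptions.

```python
def accuracy_metrics(matrix, labels):
    """matrix[i][j] = number of objects mapped as labels[i] whose reference is labels[j]."""
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    overall = correct / total
    users, producers = {}, {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])                       # everything mapped as this class
        col_total = sum(matrix[r][i] for r in range(n))  # everything the reference puts in this class
        # User's accuracy relates to commission error, producer's to omission error.
        users[label] = matrix[i][i] / row_total if row_total else float("nan")
        producers[label] = matrix[i][i] / col_total if col_total else float("nan")
    return overall, users, producers


# Hypothetical counts for illustration only (not the values reported in Table 5).
labels = ["A", "B", "C", "D"]
matrix = [
    [40, 5, 1, 0],
    [4, 20, 6, 1],
    [1, 6, 18, 3],
    [0, 1, 4, 40],
]
overall, users, producers = accuracy_metrics(matrix, labels)
```

In this toy matrix most off-diagonal counts sit next to the diagonal, which mirrors the plus-or-minus-one-category error pattern typical of ordered damage classes.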
Given that Astrape's classes are automated based upon the relative damage in the AOI and not explicitly set to represent the thresholds of traditional damage categories, the lower accuracy in Categories B and C is expected. However, the good agreement between ground data and predicted values in Category A indicates that Astrape is capturing the majority of the damage within the top three classes and correctly predicts the worst damage as Category D. If more damage were missed within the top three classes, a lower accuracy in Category A would be expected. Classification errors generally fell within plus or minus one category. For example, in Category B, most misclassified objects were assigned to Category A or C and few were assigned to Category D. This directly reflects the way JNB breaks up the initial dataset (Section 2.2.2), optimizing the breaks by minimizing variance within classes. It is not surprising, then, that Astrape-produced objects that disagree with the ground data are still near the category indicated by the ground data.

The Beachie Creek Fire in Oregon, USA, was detected on 16 August 2020 and impacted an estimated 78,336 hectares (193,573 acres) before containment was declared on 31 October 2020 [66]. The burned area included large sections of the Willamette National Forest in the Cascade Mountains in western Oregon, a primarily coniferous forest dominated by Douglas-fir (Pseudotsuga menziesii). We delineated an AOI (80,470 ha) corresponding to the primary burned area in the RAVG product (Table 1) for this wildfire event.
Sentinel-2 imagery was used to map this event ( Table 6). The images were topographically corrected, masked to exclude saturated pixels, clouds, cloud shadows and water using the Sentinel-2 QI band and further masked using the RAVG mask band (the latter depicts primarily water features). This alignment with the RAVG map allowed us to use it during accuracy assessment to understand Astrape's performance on the Beachie Creek Fire area. The RAVG maps are produced for most fires in the western United States that are 1000 acres (~405 hectares) or larger and map damage severity across a number of categories, including canopy cover damage [14]. This product is generated using Landsat or Sentinel-2 imagery and classification is done via a thresholding of the RdNBR index, calibrated using ground data collected from previous fires in this region. This thresholding technique is known to work well with Categories A and D, but has potentially poor accuracy for areas with mid-range levels of damage, Categories B and C [13].
The accuracy assessment was conducted using the object-based approach developed by Maxwell and Warner [67], in which more weight is placed on the center of an object, with less impact from fuzzy edges that may skew results. We selected "area based" normalization (more appropriate for mapping land cover characteristics), set the weighting (power) parameter to 1.5 due to uncertainty along object edges in both maps and set the saturation distance (which controls how the weight is applied) to 200 m. Due to the high concentration of damage in this particular AOI, the top two classes Astrape produced were set to represent Category D (classes 5 and 6), the third highest Category C (class 4), the fourth highest Category B (class 3) and the remainder were lumped into Category A (classes 2 and 1) (Figure 5).
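The class-to-category assignment described above, and the similar one used for the derecho in Case Study 1, can be expressed as a small lookup. The sketch below is illustrative only; `classes_to_categories` and its `n_top_as_d` parameter are hypothetical names, and the rule that exactly one class each maps to Categories C and B is our reading of the two case studies rather than a documented Astrape setting.

```python
def classes_to_categories(n_classes, n_top_as_d=1):
    """Map Astrape class numbers (1 = least change) to reference categories A-D."""
    if n_top_as_d < 1 or n_classes < n_top_as_d + 3:
        raise ValueError("need at least one class each for Categories A, B, C and D")
    mapping = {}
    ranked = range(n_classes, 0, -1)  # highest (most damaged) class first
    for rank, cls in enumerate(ranked):
        if rank < n_top_as_d:
            mapping[cls] = "D"
        elif rank == n_top_as_d:
            mapping[cls] = "C"
        elif rank == n_top_as_d + 1:
            mapping[cls] = "B"
        else:
            mapping[cls] = "A"  # all remaining low classes
    return mapping
```

Calling `classes_to_categories(6, n_top_as_d=2)` reproduces the assignment used for this AOI, while `classes_to_categories(7)` reproduces the one used for the derecho.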

Results
A side-by-side comparison of the RAVG and Astrape maps (Figure 5) shows good agreement within Category D (red). The overall accuracy was 78.8%, with the highest accuracies in Categories A and D, as expected (Table 7). For Category D specifically, good performance was seen for both errors of commission and omission, with user's and producer's accuracies of 92.7% and 88.1%, respectively. Accuracies for Categories B and C were significantly lower (which can be visually seen in the maps), but this was expected. The alignment between Astrape and RAVG within Category D is encouraging. Astrape is intended primarily as a tool to assist in directing resources to the hardest-hit areas and in this case it succeeds. Astrape produces relative damage maps and it is known the classes are unlikely to correspond exactly to the RAVG thresholds. Moreover, the RAVG map is known to contain errors in the middle classes. Given Astrape's performance with other disturbance types, it is entirely possible that it more accurately mapped Categories B and C, but this could not be verified with in-situ data.

[68]. This diverse forest contains bayous, prairies, bottomland hardwood forests and upland pine forests. The AOI (788,450 hectares) encompasses the majority of the Kisatchie National Forest within an archipelago of six polygons (Figure 6). To provide a direct comparison of Astrape's performance using different datasets, we mapped a subset of the AOI using Dove imagery (this site is hereafter referred to as the AOI subset) (Figure 7).
Figure 6. Two-class map produced by Astrape using Sentinel-2 imagery to map Hurricane Laura damage in the large AOI, which includes six large parts of the Kisatchie National Forest (outlined in red).

Figure 7. Two-class map produced by Astrape using Dove imagery to map Hurricane Laura damage in the subset AOI (the western part of the Calcasieu Ranger District in the Kisatchie National Forest, located in the south west corner of the large AOI depicted in Figure 6).
Sentinel-2 images were obtained on 12 June and 30 September 2020 (Table 8), while the Dove imagery was collected on 19 August and 29 September (Table 9). As in Case Study 1, Dove mosaics for this event were also corrected using LORACCS [63]. In both AOIs, images were topographically corrected and masked using the Tree Canopy Cover data.

Table 8. Sentinel-2 imagery used to produce the map of damage caused by Hurricane Laura in Figure 6.
Hurricane Laura Large AOI Imagery Source: Sentinel-2
Purpose    Date/Time (UTC)
Before     12 June 2020/164849
After      30 September 2020/165019

Table 9. Dove imagery used to produce the map of damage caused by Hurricane Laura in Figure 7.
Hurricane Laura Small AOI Imagery

Reference data included ground-based observations collected by two USFS personnel examining the damage from the roads. The damage was assessed based upon the traditional damage categories (Figure 3) and resulted in 124 points for the AOI and 42 points for the smaller AOI subset. Due to the limited ground data available, we only use a two-class approach in this location: >50% damage (i.e., Category CD) and <50% damage (i.e., Category AB).
To assess accuracy, we applied a 30 m-radius buffer around each observed GPS point to account for positional error and assigned the majority Astrape-designated damage category within that buffer to that location. Accuracy was then assessed by comparing the ground-based information to the Astrape results for the buffered location.
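The buffer-and-majority step can be sketched as follows. This is an illustration under simplifying assumptions, not the procedure's actual implementation: we assume the Astrape map has been rasterized to a regular grid of category labels, that coordinates are in metres measured from the grid's upper-left corner, and that a cell belongs to the buffer when its centre lies within the radius. The name `majority_in_buffer` and the 10 m cell size are hypothetical.

```python
from collections import Counter
import math


def majority_in_buffer(grid, cell_size, x, y, radius=30.0):
    """Most common label among cells whose centres lie within `radius` of (x, y)."""
    counts = Counter()
    n_rows, n_cols = len(grid), len(grid[0])
    # Only scan the rows and columns that the buffer can reach.
    r0 = max(0, int((y - radius) // cell_size))
    r1 = min(n_rows - 1, int((y + radius) // cell_size))
    c0 = max(0, int((x - radius) // cell_size))
    c1 = min(n_cols - 1, int((x + radius) // cell_size))
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            cx = (c + 0.5) * cell_size
            cy = (r + 0.5) * cell_size
            # Masked cells are stored as None and ignored.
            if grid[r][c] is not None and math.hypot(cx - x, cy - y) <= radius:
                counts[grid[r][c]] += 1
    if not counts:
        return None  # the buffer fell entirely on masked or off-grid cells
    return counts.most_common(1)[0][0]


# Hypothetical 8 x 8 grid of 10 m cells: left half "AB", right half "CD".
grid = [["AB"] * 4 + ["CD"] * 4 for _ in range(8)]
west_point = majority_in_buffer(grid, 10.0, 20.0, 40.0)
east_point = majority_in_buffer(grid, 10.0, 60.0, 40.0)
```

Ties are resolved arbitrarily in this sketch (the label encountered first wins), so a production version would need an explicit tie rule.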

Results
The Astrape maps reveal an unusual pattern in which the majority of the hurricane damage was sustained in riparian areas (Figures 6 and 7). While this appears suspicious, our accuracy assessment indicates that this pattern accurately characterizes in-situ damage assessments. For the Sentinel-2-based map, Astrape exhibited an overall accuracy of 78% (Table 10), with better agreement in the lower damage class than the higher class. For the Dove-based map, there was good agreement in both classes with an overall accuracy of 86% (Table 11). Most importantly, the damage pattern in the AOI subset is consistent with the damage pattern in the corresponding area of the larger AOI.

Table 10. Confusion matrix using damage categories grouped into two main classes (Less than 50% damage: AB; 50% damage and greater: CD) to quantify Astrape's performance with Sentinel-2 imagery on Hurricane Laura Damage (Figure 6).
Cohen's Kappa: 0.50

Table 11. Confusion matrix using damage categories grouped into two main classes (Less than 50% damage: AB; 50% damage and greater: CD) to quantify Astrape's performance with Dove imagery on Hurricane Laura Damage (Figure 7).

This case study demonstrates the similarity of the results when using the same methods with Sentinel-2 and Dove imagery, which is consistent with results reported elsewhere [19]. The similar damage pattern produced in the subset AOI in both the Sentinel-2 and Dove maps demonstrates Astrape's flexibility and ability to highlight the worst of the damage even when the size of the AOI, imagery sources and imagery dates all differ. The clear benefit of using Sentinel-2 in this case is the economical coverage over a large area while still offering 10 m spatial resolution. The use of high spatial-resolution Dove imagery in the smaller AOI demonstrates comparable results, despite being limited to four bands. Here, one of the primary benefits of using Dove imagery, its high temporal resolution, was unfortunately diminished by persistent cloud cover following Hurricane Laura.

In this case study, we apply Astrape to "explore" and map previously unknown locations of tornado damage resulting from a large storm in northern Wisconsin and Michigan on 9 August 2020. We were asked by collaborating forest managers to run the system in an effort to locate the exact paths of the tornado damage. UAV support had been requested by our collaborators to help with the mapping effort, but their request was denied. The newly-developed Astrape offered a welcome tool for necessary reconnaissance. Thus, we first ran Astrape over a large area and focused solely on the top damage class it produced. This enabled us to quickly locate three tornado paths, for which we then drew new AOIs and produced maps for each tornado path individually. Sentinel-2 imagery corrected to surface reflectance from 17 July and 11 August 2020 was used; masking relied on a layer derived from the LANDFIRE database (https://www.landfire.gov/lfrdb.php accessed on 15 January 2021) to exclude non-forested areas.

Due to travel restrictions associated with the COVID-19 pandemic, we were unable to obtain enough reference data to perform an accuracy assessment for this site. Feedback from collaborators in the Wisconsin DNR and USFS included limited ground data and handheld photographs taken from a plane, with a consensus that the top two damage classes aligned with their in-situ observations for representing the worst of the damage.

Results
The three tornado paths located in "explore mode" (red, Figure 8a) are shown in Figures 8b-d. Spatially, the outline of the tornado damage is very clear. There is no uncertainty regarding the path of the tornados and the location of the most severe damage. Our collaborators indicated the maps aligned closely with data collected from their limited ground work and hand-held aerial photography, as delineated by the black line in Figure 8d and the photograph provided as an inset in Figure 8b.
Figure 8. (c) One of the three tornado paths found and subsequently mapped. (d) The black outline is a limited area walked by USFS personnel to map the damage before these maps were generated (the western portion had not been walked yet; the white area on the eastern side was masked due to a cloud).
Specifically, for the tornado depicted in Figure 8b, a Wisconsin DNR forest health professional was able to provide a few damage readings from handheld photography she obtained from aviators who flew an aerial reconnaissance of the area. After assessing the damage, she was able to use landmarks and her knowledge of the area to help correlate the locations in the AOI. Overall, the Astrape damage labels correspond directly to her reference labels in nearly all cases, with all five "heavily damaged" areas labeled as "most damaged" by Astrape. Areas with lesser amounts of damage were far more difficult to assess, however. Given the nature of the reference data, we interpret these results with a simple appreciation that the Astrape map shows great potential for locating tornado damage, especially in the highest classes.

Discussion
In this work, we develop and test Astrape in several locations and disturbance events with vastly different AOI sizes (Table 3), using both Planet Dove and Sentinel-2 imagery. The results show that the system performed well with the same flexible, unaltered framework in each case study and has already proven effective for operational use. The key components that allow Astrape to perform well across different regions and disturbance types are the automated optimizations that select image segmentation parameters and the number of damage classes JNB produces. Combined, these two powerful techniques allow Astrape to conform to different satellite imagery sources, forest types, topography and extents of damage. While we have only tested Astrape on Sentinel-2 and Dove, the framework also supports Landsat imagery and could be applied to other sources, such as aerial or UAV imagery.
One of the key benefits of Astrape is its flexibility in developing the number and type of classes necessary to map damage in a given AOI. In some cases, the top class may represent 75-100% canopy damage. In others, it may indicate a different level, such as 40% damage, because the most severe damage in that particular area was only 40%. While Astrape cannot provide exact estimates of forest damage for each class on its own, there are, in fact, few systems that can; HiForm and ORS, for example, both recommend a comparable amount of ground verification. If a spatially detailed (<10 m resolution) and timely map of forest damage is available-such as that from Astrape-many forest managers are happy to do the spot checks necessary to ascertain the additional damage information (personal correspondence with dozens of collaborating forest managers). Simply knowing where to focus their attention is very valuable information, even when a definitive damage category is unknown.
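To illustrate why the top class is relative to each event, the sketch below applies a natural-breaks classification to two hypothetical sets of per-object change values. It is a simplified stand-in and not the JNB implementation Astrape uses: it finds the grouping with the smallest total within-class squared deviation by exhaustive search, which is only practical for small inputs, and the function name `natural_breaks` and all values are assumptions made for illustration.

```python
from itertools import combinations


def natural_breaks(values, n_classes):
    """Split sorted values into n_classes contiguous groups with the smallest
    total within-class sum of squared deviations (exhaustive search)."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    def sse(group):
        mean = sum(group) / len(group)
        return sum((v - mean) ** 2 for v in group)

    best_cost, best_groups = None, None
    # Each combination lists the indices at which a new class starts.
    for cuts in combinations(range(1, n), n_classes - 1):
        edges = (0,) + cuts + (n,)
        groups = [data[edges[i]:edges[i + 1]] for i in range(n_classes)]
        cost = sum(sse(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


# Hypothetical change magnitudes for two events of different severity.
severe_event = [0.02, 0.03, 0.05, 0.21, 0.24, 0.26, 0.55, 0.58, 0.61]
milder_event = [0.02, 0.03, 0.05, 0.11, 0.12, 0.13, 0.38, 0.40, 0.41]
severe_classes = natural_breaks(severe_event, 3)
milder_classes = natural_breaks(milder_event, 3)
```

Both events yield three classes, but the top class of the milder event covers much smaller change values, which is why ground verification is still needed to attach a damage percentage to each class.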
The measures taken to keep XGBoost from simply replicating the JNB classification (Section 2.2.2), forcing it instead to leverage all VI data, increased accuracy by 4-5 percentage points during development with two sites (the Wisconsin derecho data in Case Study 1 and another similar area owned by the Wisconsin DNR hit by the same storm, for which we also had ground data). This substantiated our hypothesis that XGBoost is able to refine the JNB classes, albeit only in two similar cases. We believe this warrants further investigation; however, the processing cost and time needed to run XGBoost are minimal, the step is unlikely to make results worse and we believe it is worth the potential improvement in map quality.
Astrape does have limitations, however. Shadows can cause errors, which is a frequent obstacle in most high-spatial resolution image applications. Shadows are typically caused by topography and sun angle, and differences in shadowing between images become more pronounced with increasing temporal separation (both in time of acquisition and day of year) [69]. Topographic correction, while useful, does not completely solve this issue when shadows are a result of terrain, such as in mountains. Where possible, the images in a pair should be acquired as close together in time as possible to reduce topographic shading and reflectance errors [69]. With high spatial resolution imagery, shadows cast by tree crowns can also be problematic. With the Dove imagery, we especially noted errors from shadows cast by trees along the edges of clearings (road, meadow, etc.). While it is possible to mask out shadows, doing so may be problematic in this particular application because trees along an edge are especially susceptible to damage from strong winds; masking these areas could inadvertently exclude real damage from analysis. In forested areas without dramatic topographic or landcover variation, the benefits of shadow masking are therefore not likely worth the risk of losing valuable forest damage information, though users should be cognizant of this potential for error when using the maps.
Related to this issue of image timing is the impact of phenological changes between images; Astrape can be challenged when events such as spring green-up and fall senescence lead to differences in reflectance that the system interprets as change. Astrape is built on the premise that changes in reflectance equate directly to levels of damage and this is both its benefit and its downfall. Unlike long-term, time-series-based change detection methods, Astrape has no way to account for phenological differences and, thus, use of Astrape is generally limited to severe disturbances during the summer season. It might be possible to account for phenology by including ancillary data, but in cases such as these, it may be more prudent to simply explore other mapping options better suited to accounting for phenological challenges (e.g., LandTrendr or ORS).
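The phenology problem can be made concrete with a two-date vegetation index difference. The sketch below uses NDVI purely as a familiar example; it is an assumption on our part that a single index is representative, since Astrape draws on several VIs, and all reflectance values are hypothetical.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index from red and near-infrared reflectance."""
    if red + nir == 0:
        return 0.0
    return (nir - red) / (nir + red)


def ndvi_change(pre, post):
    """Pre-event minus post-event NDVI; positive values indicate vegetation loss."""
    return ndvi(*pre) - ndvi(*post)


# Hypothetical (red, NIR) reflectance pairs for two forest stands.
healthy_summer = (0.04, 0.45)
after_windthrow = (0.10, 0.25)   # damaged stand, imaged shortly after the event
after_senescence = (0.07, 0.32)  # undamaged stand, imaged during autumn leaf-off

damage_signal = ndvi_change(healthy_summer, after_windthrow)
phenology_signal = ndvi_change(healthy_summer, after_senescence)
```

In this example the undamaged stand still shows a clearly positive change, so a two-date differencing approach with no seasonal baseline cannot tell it apart from light damage.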
Astrape was designed specifically for low-latency operational use to respond to severe disturbance events and is best viewed as a complementary option among other well-established disturbance mapping tools such as ForWarn, HiForm, RAVG and airborne assets. Astrape helps fill a niche for safe, high-spatial image exploitation to produce severe disturbance maps. The decision on which mapping method to use will depend upon the known capabilities of the chosen system, timeline, costs, safety and goals of forest managers. For example, large wildfires in the western United States are already well-mapped by the USFS's RAVG products. However, RAVG is not calibrated for other areas in the United States and, given its good performance on the Beachie Creek Fire, Astrape may prove to be a more flexible tool in other regions. Astrape's ability to leverage high-temporal Dove imagery also makes it unusual in its ability to respond to events more quickly and provide high-spatial resolution maps, as seen in the Wisconsin derecho case study. Nevertheless, it still may suffer from delays due to a lack of cloud-free imagery, as seen in the Hurricane Laura case study. There will undoubtedly be times when aerial reconnaissance flights or UAVs remain the best option for obtaining a quick overview of damage. Case Study 4 in particular highlights the potential for Astrape to aid forest managers following severe disturbance events. Because of COVID-19 restrictions, detailed aerial surveys with forest health professionals could not be conducted, limiting the ability of the forest managers to obtain a good understanding of the extent of the damage. Astrape maps were generated using imagery acquired just three days after the storm, greatly improving knowledge of the location and severity of damage and aiding foresters in their response efforts on the ground.
Regarding the explore mode used in Case Study 4, especially at this scale, we acknowledge that Astrape is not unique. Systems like ForWarn and newer Landsat-based systems (Table 1) have this capability. However, this case study illustrates Astrape's capacity for near-real-time mapping of damage. It is also possible to run Astrape with Sentinel-2 or Landsat data to perform this initial step and then run it again with Dove imagery in severely affected areas (i.e., similar to the Hurricane Laura case study). Another option is to create AOIs from the results of a system such as ForWarn and use them with Astrape to map areas with higher spatial resolution data, combining these systems in a synergistic way. We believe there is much potential in this regard.
Future efforts include testing Astrape on different disturbance types. We recognize that three of the case studies presented here-derecho, hurricane and tornado damage-are all variants of wind damage. While windthrow and wildfire are major causes of damage, we are continuing to use Astrape on other disturbance agents not shown here, such as severe defoliation, flooding and landslides, to better understand how it performs in new situations.
Additional future efforts also include investigating the ability to calibrate Astrape's results after ground verification has been done. For that purpose, a third module has been built that allows for supervised classification. The automated values of the JNB breaks are provided as output by Astrape and these can be manually adjusted to better fit collected ground data. The calibrated values can then be provided as input to the supervised module, which performs the same functions as the Automated Classification Module but "skips" the JNB portion (Section 2.2.2). This module was not tested here, primarily because the original maps were sufficient for our collaborators' management needs, and it remains an area of future study. In addition, the unclassified differenced map generated in Astrape's Image Segmentation and Differencing module (Figure 1) is also provided as an output and could be used with other supervised methods, offering further areas for future research.
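The calibration idea, in which the automated breaks are replaced with manually adjusted ones and the objects are then reclassified, can be sketched as below. This is not the supervised module itself: `classify_with_breaks` is a hypothetical helper, the break and change values are invented for illustration, and we assume that a value equal to a break belongs to the higher class.

```python
from bisect import bisect_right


def classify_with_breaks(values, breaks):
    """Assign each value a class number (1 = lowest) from ascending break values."""
    if list(breaks) != sorted(breaks):
        raise ValueError("breaks must be in ascending order")
    # bisect_right counts the breaks at or below the value, so a value equal
    # to a break is placed in the higher class.
    return [bisect_right(breaks, v) + 1 for v in values]


# Hypothetical per-object change values and break sets.
object_changes = [0.05, 0.20, 0.50]
automated_breaks = [0.10, 0.30]   # as output by the automated classification
calibrated_breaks = [0.10, 0.60]  # upper break raised after field checks

automated_classes = classify_with_breaks(object_changes, automated_breaks)
calibrated_classes = classify_with_breaks(object_changes, calibrated_breaks)
```

Raising the upper break moves the third object out of the top class, which is the kind of adjustment a manager might make when field checks show that the automated top class is too inclusive.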

Conclusions
Astrape was built specifically for operational use on a wide variety of abiotic severe forest disturbance types and locations and offers a flexible, effective and efficient approach to early response mapping. Able to complement other mapping tools already in operation, it fills a niche by offering a unique, automated GEOBIA framework capable of leveraging high-spatial resolution satellite imagery to map forest damage at fine spatial and temporal scales. Astrape's strengths are its flexibility to leverage different sources of imagery, its ability to conform to various AOI sizes by automating and optimizing the segmentation process, and its consistent production of meaningful classes relative to a given disturbance event through a framework tying together JNB and XGBoost. Astrape does not eliminate the need for fieldwork to verify the damage level for the resulting classes, but it can greatly reduce the dependence on field sampling and, by extension, the time, costs and risks associated with it. Astrape is a change detection system and caution should be used when applying it to areas during periods of prominent phenological changes or when the landcover is not primarily forest. Imagery should be chosen to be as temporally close as possible to mitigate the effects of phenology and errors from shadows due to sun angle. Astrape's maps have already proven useful to forest managers in severe disturbance events and future work will be focused on continuing to improve Astrape while simultaneously making it available for sustained operational use. Our goal is to work with our collaborators at the federal, state, county and private levels to integrate Astrape with other operational systems already available.

Data Availability Statement: Publicly available datasets and imagery referenced in this study can be found as described in the text. For all other data (i.e., ground reference data), please contact the corresponding author.